Europäisches Patentamt

European Patent Office          Publication number: 0 020 251

Office européen des brevets     A2



EUROPEAN PATENT APPLICATION

Application number: 80400722.7     Int. Cl.: C 12 N 15/00, C 12 P 21/02,
                                             A 61 K 39/29

Date of filing: 23.05.80

Priority: 24.05.79 US 41909
          26.12.79 US 107267



Applicant: THE REGENTS OF THE UNIVERSITY OF 
CALIFORNIA, 2200 University Avenue, Berkeley, 
California 94720 (US) 


Date of publication of application: 10.12.80
Bulletin 80/25



Inventor: Rutter, William J., 80 Everson Street, San 
Francisco California 94131 (US) 

Inventor: Goodman, Howard Michael, 3006 Clay Street, 
San Francisco California 94127 (US) 


Designated Contracting States: AT BE CH DE FR GB IT
LI LU NL SE



Representative: Martin, Jean-Jacques et al, Cabinet 
REGIMBEAU 26, Avenue Kleber, F-75116 Paris (FR) 



DNA transfer vector, host transformed with it, vaccine, and their production. 



DNA transfer vector comprising the genome of a
non-passageable virus. The vector can be a plasmid and the virus
hepatitis B virus. The production of antigenic protein by the
expression of the recombinant DNA in a host cell. The enhancing
of the antigenicity and the stability of the antigenic protein by
chemical modification.

NON-PASSAGEABLE VIRUSES

BACKGROUND AND PRIOR ART 

The present invention relates to the study 
of virus-caused diseases. In particular, the 
invention relates to viruses that with current tech- 
nology fail to multiply in cultured cells or embryonic 
tissues, and hence cannot be produced in quantity. 
Sometimes they do not produce recognizable cytopath- 
ology. Therefore their biological effects have been 
difficult to study. For the most part, such viruses 
can only be obtained from humans accidentally or 
voluntarily infected or from infected higher primates;
only occasionally can they be obtained from infected
lower species. Such viruses are termed herein
non-passageable viruses, or NP-viruses, in recognition of
the fact they either cannot be maintained or replicated
by passage through tissue culture cells, embryonic
tissues, or lower organisms or that it is difficult or
impractical to do so. The diseases caused by such
viruses may have long latent periods, and sometimes
result in derangement of the patient's immune system or
in carcinogenic transformation. Examples of such
viruses include the Hepatitis B Virus (HBV), the
"slow" viruses such as the causative agent of kuru,
and the viral agent implicated in the etiology of
multiple sclerosis, and the xenotropic viruses, such
as the C-type particles implicated in the causation of
certain tumors. NP viruses may be associated with
chronic crippling or wasting diseases, or with cancer.
In one case, HBV, there is evidence for dual
pathogenicity, inasmuch as there is strong evidence linking
this virus to primary carcinoma of the liver as well
as to hepatitis.

In view of the serious and insidious health
hazard presented by NP viruses, there is a need for a
biological system of general utility to enable research
on these viruses to go forward. Such a system will open
an entire new research field and will provide means
for the production of genetically pure viral antigens
and antibodies thereto and permit production of viral
components in desired amounts. The present invention
provides such a biological system of general utility
for enabling a vast amount of research which is currently
impossible due to the nature of NP viruses. The
system is also useful for the study of passageable
viruses, offering the advantages of reduced biohazard,
the capability to synthesize and modify specific
virus-coded proteins, and to obtain quantities of
viral DNA and virus-coded proteins sufficient for
chemical and biochemical analysis, and for the production
of vaccines. The nature of the system and the
practice of the invention have been demonstrated with
HBV. Further background relating to HBV, and the
terminology employed in the art, will be discussed
infra.

Until recently, hepatitis has been a disease
characterized primarily in terms of its symptoms and
epidemiology. In 1967, Blumberg and co-workers first
described an antigen associated with infection by
hepatitis type B [see Blumberg, B.S., Science
197, 17 (1977)]. Since then, extensive research
has contributed a wealth of information about the 
disease. The causative agent is a DNA virus known as 
Hepatitis B Virus (HBV). The serum of infected
patients contains a variety of particle types associated
with infection. The whole virus particle is
believed to be essentially spherical and 42 nm in
diameter, comprising an envelope, a core and DNA, and
is termed the "Dane" particle, after its discoverer
[Dane, D.S. et al., Lancet 1970-1, 695 (1970)]. The
envelope contains the surface antigen (HBsAg),
discovered by Blumberg. The core contains an immunologically
distinct antigen, HBcAg. The DNA isolated
from Dane particles is circular and contains single-stranded
regions of varying length, Summers, J. et al.,
Proc. Nat. Acad. Sci. USA 72, 4597 (1975); Landers, T.A.
et al., J. Virol. 23, 368 (1977); Fritsch, A. et al.,
C.R. Acad. Sci. Paris D 287, 1453 (1978). The surface
antigen is found in the serum of persons infected with
HBV and in certain carrier states. Antibodies to HBsAg
are found in the serum of patients who have been
infected with HBV. Antibodies to the core antigen are
also found in certain carrier states. A radioimmunoassay
has been developed for HBsAg, Ling, C.M.
et al., J. Immunol. 109, 834 (1972), and for anti-HBsAg,
Hollinger, F. et al., J. Immunol. 107, 1099 (1971).

The HBsAg is an immunochemically defined
material associated with the envelope of the virus.
Previous studies indicate that HBsAg comprises several
components of varying antigenicity, including both
glycosylated and non-glycosylated proteins as major
components (Peterson, D.L., et al., Proc. Nat. Acad. Sci.
U.S.A. 74, 1530 (1977); Peterson, D.L., et al. in
Viral Hepatitis, A Contemporary Assessment of Etiology,
Epidemiology, Pathogenesis and Prevention (G.N. Vyas,
S.N. Cohen and R. Schmid, eds.), pp. 569-573,
Franklin Institute Press, Philadelphia, 1978). In
addition, lipid and several additional protein
components have been reported to be present in surface
antigen preparations, Shih, J.W.K. and Gerin, J.L.,
J. Virol. 21, 347 (1977). The major protein components
were reported as having molecular weights (M.W.) of
22,000 and 28,000 daltons for the non-glycosylated and
glycosylated proteins, respectively, based upon sodium
dodecyl sulfate (SDS) gel electrophoresis, Peterson,
et al. (1977), supra. An N-terminal sequence of
9 amino acids of the 22,000 M.W. protein, isolated from
plasma of a human carrier of HBsAg by preparative SDS
gel electrophoresis, was reported to be
Met-Glu-Asn-Ile-Thr-(Ser or Cys)-Gly-Phe-Leu (Peterson, et al., 1977,
supra).

Standard abbreviations are used herein to
denote amino acid sequences:

Ala = Alanine            Cys = Cysteine
Gly = Glycine            His = Histidine
Glu = Glutamic acid      Lys = Lysine
Gln = Glutamine          Leu = Leucine
Asp = Aspartic acid      Ile = Isoleucine
Asn = Asparagine         Val = Valine
Arg = Arginine           M or Met = Methionine
Ser = Serine             Tyr = Tyrosine
Thr = Threonine          Phe = Phenylalanine
Trp = Tryptophan         Pro = Proline

All amino acids are in the L-configuration unless stated
otherwise. In some instances herein, methionine is
designated by M to signify its potential role in translation
initiation. An N-terminal sequence of 19 amino
acids for a protein similarly isolated was reported to
be: Met-Glu-Asn-Ile-Thr-Ser-Gly-Phe-Leu-Gly-Pro-Leu-
Leu-Val-Ser-Gln-Ala-Gly-Phe (Peterson, et al., 1978,
supra). The non-glycosylated protein was reportedly
immunogenic, but the glycosylated peptide, isolated
as described by Peterson et al., 1977, supra, was not.
However, other workers have reported a glycosylated
peptide component which was immunogenic, Gerin, J.L.
et al., in Viral Hepatitis, supra, pp. 147-153 (1978).
The discrepancy has not been fully explained. It is 
known that the immunogenicity of the surface antigen 
proteins is sensitive to conformation changes. 
Possibly the use of detergents in the isolation and 
purification of surface antigen proteins from serum 
or plasma leads to diminished immunological reactivity. 

The ability to detect the surface and core 
antigens has proven of great clinical value, especially 
for the screening of potential blood donors, since 
transfusion is one of the more common modes of HBV 
transmission in developed countries. Presently avail- 
able sources of Dane particles for partially purified 
HBsAg limit the quality and quantity of antibody which 
can be produced. The virus cannot be grown in culture 
and can only be obtained from infected human patients 
or after infection of higher primates. Therefore, 
there is no means for maintaining stocks of HBV or for 
obtaining desired amounts of the virus or any of its 
components. The virus exerts no cytopathic effects on
cultured cells or tissues, so that no means for measurement
of infected virus particles is currently available.
Genetically pure HBV stocks have not been available
prior to the present invention. These limitations
severely restrict efforts to provide HBsAg in improved
amount and quality for the production of antibody
suitable for more sensitive immunoassay, for passive
immunization, and antigen for active immunization.
Furthermore, the inability to passage the virus outside
of humans or higher primates makes it impossible
to obtain sufficient antigen for the production of a
vaccine. The limited host range of HBV and its 
failure so far to infect tissue culture cells have 
drastically restricted study of the virus and have 
hindered development of a vaccine for the serious 
diseases that it causes. 

Recent evidence strongly indicates a link 
between HBV and primary hepatocellular carcinoma. 
Epidemiological studies have indicated a high 
correlation of HBsAg or HBcAg in patients with primary 
hepatocellular carcinoma, Trichopoulos, D. et al.,
Lancet 1978, 8102. More significantly, a strain of
cultured hepatocellular carcinoma cells ("Alexander" 
cells) is known to produce HBsAg. These cells there- 
fore contain at least part of the HBV genome. Further 
elucidation of the role of HBV in hepatocellular 
carcinogenesis and the molecular mechanisms of the 
carcinogenic transformation depends upon the develop- 
ment of suitable biological systems for maintenance 
and manipulation of the virus or its genome. 
SUMMARY OF THE INVENTION 

The invention provides, for the first time, 
a biological system for maintaining, modifying and 
replicating a genetically pure stock of an NP virus 
genome or a fragment thereof. The system provides 
means for making genetically pure viral components, 
such as coat and core proteins suitable for vaccines 
and for making viral DNA for use in studying the 
molecular biology of the viral infection and repli- 
cation process. The latter is especially valuable 
because of its significance in understanding the in- 
duction of the chronic diseases NP-viruses typically 
cause, including certain auto-immune diseases and 
certain types of cancer. 

The present invention is exemplified by
the cloning and expression of HBV-DNA. Novel DNA
transfer vectors are provided containing both the
entire HBV genome and portions thereof. The transfer
vectors are used to transform a suitable host, thereby
permitting replication of the cloned viral DNA, or
portions thereof, and also permitting the biological
synthesis of viral proteins, including an immunologically
active protein constituent of HBsAg, in
desired amounts. An immunologically active protein
constituent of HBsAg is useful as a vaccine for active
immunization, and for the production of antiserum
which in turn is useful for clinical screening tests
and for providing passive immunity. A purified
immunologically active protein constituent of HBsAg,
designated the S protein, and fusion proteins thereof
with a procaryotic protein fragment have been synthesized
by a microorganism. The S-protein and derivatives
thereof are useful as antigens to make a vaccine
against HBV.

A novel DNA transfer vector comprising the
entire HBV genome and a microorganism transformed
therewith were placed on deposit in the American Type
Culture Collection, 12301 Parklawn Drive, Rockville,
Md. 20852, on May 23, 1979 in conjunction with the
filing of the parent application. The deposited
transfer vector is that designated pEco63 herein,
with ATCC accession number 40009. The deposited
microorganism, E. coli HB101/pEco63, has ATCC accession
no. 31518.

DETAILED DESCRIPTION OF THE INVENTION 

A novel biological system is provided for 
maintaining, replicating and modifying an NP-viral 
genome or cDNA thereof. The system is a combination
of methods and compositions of matter that render
NP-viruses amenable to a variety of research activities.
The principal limitation is that at least a portion
of the viral genome be isolatable either as viral
genetic material or as viral mRNA. In general, the
method entails isolating and purifying the viral
genome or portion thereof, recombining the isolate
with a DNA transfer vector, and transferring the
transfer vector to a suitable host cell wherein the 
transfer vector is replicated and its genes expressed. 
Novel transfer vectors are thereby produced comprising 
all or part of the viral genome. 

The NP viral genome may be either DNA or RNA. 
In the case of DNA, the entire genome or a fragment may 
be recombined directly with a transfer vector. In 
some circumstances, viral mRNA may be isolated from 
tissues or cells of infected individuals, whereby it
would be possible to synthesize a cDNA copy of the 
viral mRNA. The cDNA would then be recombined with a 
DNA transfer vector. In the case of an RNA virus, cDNA 
reverse transcripts of the viral genome are readily 
obtainable, and would then be recombined with a DNA 
transfer vector. 
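
The relationship between a viral mRNA and its cDNA copy can be sketched in a few lines of Python. This is only an illustration of base complementarity, not a model of the enzymatic reaction; the function name and the six-base example sequence are hypothetical and do not come from the application.

```python
# Minimal sketch: derive the first-strand cDNA sequence from an mRNA sequence.
# Both sequences are written 5'->3'. The function name is hypothetical.

RNA_TO_DNA_COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}


def first_strand_cdna(mrna: str) -> str:
    """Return the DNA strand complementary to the mRNA, read 5'->3'."""
    # The copy is antiparallel to the template, so the template is read in reverse.
    return "".join(RNA_TO_DNA_COMPLEMENT[base] for base in reversed(mrna.upper()))


if __name__ == "__main__":
    print(first_strand_cdna("AUGGAG"))
```

The resulting strand is the template from which a double-stranded cDNA, suitable for joining to a DNA transfer vector, would be completed.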

Copies of the viral DNA, replicated in host 
cells descended from a single cell and containing a 
single copy of the viral genome or genomic fragment, 
are identical in sequence to the original copy and are 
therefore clones of the viral genome, or fragment. 
Expression of cloned viral DNA is accomplished by a 
variety of in vivo and in vitro methods. Expression in
procaryotic host cells is accomplished by inserting the
viral DNA in the middle of a translatable transfer
vector gene, in proper orientation and reading frame,
such that read-through translation or re-initiation
translation occurs. In vitro translation can be
carried out using known methods for DNA-directed
protein synthesis [Zubay, G., Ann. Rev. Genetics 7,
267 (1973)]. Where nontranslated intervening
sequences are encountered, see, e.g., Crick, F.H.C.,
Science 204, 264 (1979), suitable eucaryotic host
cells capable of correctly translating genes of this
type may be chosen for the purpose of obtaining
expression.

Further details of the system are described 
with reference to the cloning of HBV-DNA. The cloning 
of other NP-viruses will differ in respect to details 
and variations known in the art. For example, it will 
be understood that the selection of preferred re- 
striction endonucleases for a given virus will be a 
matter of ordinary skill. Similarly, the choice of 
transfer vectors and host cells will be based on 
principles known in the art. 

HBV-DNA may be obtained from Dane particles 
which are present in the plasma of certain human HBsAg 
carriers. Dane particles may be partially purified by 
differential centrifugation. Since much of the DNA
extracted directly from Dane particles contains single- 
stranded regions, the DNA is initially repaired by 
filling the single-stranded gaps. A conventional DNA 
polymerase reaction may be employed, acting upon DNA 
extracted from the Dane particles. However, the pre- 
ferred method is to exploit a DNA polymerase activity 
that is endogenous in the particles themselves, as 
described by Hruska, J.F. et al., J. Virol. 21, 666
(1977). In the preferred method, the DNA is first
repaired, then extracted from the particles. If 
desired, radioactive label may be incorporated during 
the polymerase reaction. 

For the purpose of cloning, the circular
HBV-DNA must be cleaved internally at one or more
sites to enable its subsequent covalent attachment
to a DNA transfer vector. The attachment process is
catalyzed by a DNA-ligase enzyme and is termed
ligation. The internal cleavage may be carried out
using non-specific endonucleases, many of which are
known in the art, which catalyze the hydrolysis of
the phosphodiester bonds of DNA at random sites on the
DNA. Preferably, however, the cleavage should be
carried out using one or more restriction endonucleases,
which catalyze the hydrolysis of only those phosphodiester
bonds located within certain deoxynucleotide
base sequences known as restriction sites. See
Roberts, R., Crit. Rev. Biochem. 4, 123 (1976). A
wide variety of restriction endonucleases is commercially
available. The existence of a given restriction
site in a given segment of DNA the size of the HBV
genome is largely a matter of chance. Some sites may
be frequently encountered, others not at all. We have
found that HBV-DNA contains a single site for the
restriction endonuclease EcoRI. Digestion of HBV-DNA
by EcoRI converts the circular DNA to linear DNA without
significant alteration of molecular weight. As a
consequence of using a restriction endonuclease, all the
linear digestion products have the same base sequence
at their ends. Similarly, digestion by the enzyme
BamHI produces two linear DNA fragments, which can be
fractionated according to molecular length by gel
electrophoresis. Digestion with both enzymes, EcoRI
and BamHI, will produce three linear DNA fragments
whose sizes determined by gel electrophoresis will
permit certain inferences as to the relative locations
of the EcoRI site and the two BamHI sites. By
analyzing the effects of various combinations of
restriction endonucleases on the sizes of fragments
produced, it is possible to construct a restriction
map of HBV-DNA which shows the relative locations of
restriction sites with respect to each other. Such
a map is shown in FIG. 1 for HBV-DNA.
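
The mapping logic described above can be illustrated with a small Python sketch that computes the fragment sizes expected from single and double digests of a circular DNA. The genome length of about 3200 bp and the BamHI fragment sizes of about 1200 and 2000 bp are taken from Example 1; the site coordinates themselves (EcoRI at position 0, BamHI at 500 and 1700) are hypothetical values chosen only to reproduce those sizes, and are not the positions shown in FIG. 1.

```python
# Minimal sketch: fragment sizes from restriction digests of a circular DNA.
# Site coordinates below are hypothetical, for illustration only.

def digest_circular(genome_length: int, cut_sites: list[int]) -> list[int]:
    """Return the sorted fragment lengths (bp) after cutting a circle at cut_sites."""
    sites = sorted(set(p % genome_length for p in cut_sites))
    if not sites:
        return [genome_length]  # uncut circle, reported at full length
    fragments = []
    for i, start in enumerate(sites):
        end = sites[(i + 1) % len(sites)]
        # Distance to the next site going around the circle; a single site
        # gives 0 here, which corresponds to one full-length linear molecule.
        fragments.append((end - start) % genome_length or genome_length)
    return sorted(fragments)


GENOME_BP = 3200           # approximate HBV-DNA length (Example 1)
ECORI_SITES = [0]          # hypothetical coordinate of the single EcoRI site
BAMHI_SITES = [500, 1700]  # hypothetical coordinates, 1200 bp apart

if __name__ == "__main__":
    print("EcoRI:        ", digest_circular(GENOME_BP, ECORI_SITES))
    print("BamHI:        ", digest_circular(GENOME_BP, BAMHI_SITES))
    print("EcoRI + BamHI:", digest_circular(GENOME_BP, ECORI_SITES + BAMHI_SITES))
```

With these assumed coordinates the 1200 bp BamHI fragment survives the double digest while the 2000 bp fragment is split in two, which is the kind of observation that would place the EcoRI site within the larger BamHI fragment.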

By appropriate choice of restriction endo- 
nucleases, it is possible to transfer the entire 
genome of HBV, or any segment and overlapping com- 
binations of segments, to a DNA transfer vector capable 
of replicating the transferred HBV-DNA in a suitable
host organism.

The choices of transfer vector and host are 
interrelated and governed by certain practical con- 
siderations such as the desired end use and the 
relevant bio-hazard. For virus particle synthesis or
for maximal rates of expression in some instances, 
eucaryotic host cells may be more suitable. The trans- 
fer vectors chosen must be capable of entering and 
replicating in the host. For rapid DNA replication, 
ease and safety of handling, for preservation of 
genetic purity and for pilot studies, a microbial host 
such as Escherichia coli is preferred. Numerous DNA 
transfer vectors are known for E. coli . Plasmid 
transfer vectors have been employed herein, merely for
convenience.

Attachment of the HBV-DNA to a transfer 
vector requires opening the transfer vector circular 
DNA, preferably at a given site, followed by ligation 
of the linear HBV-DNA with the linear transfer vector 
DNA to form a circular recombinant transfer vector 
containing the HBV-DNA inserted in its nucleotide 
sequence at the site where it was originally cleaved. 
Preferably, for recovery of the inserted DNA segment
subsequent to amplification, the ends of the transfer
vector DNA and HBV-DNA are treated to provide a
means for specifically removing the HBV-DNA
from the recombinant transfer vector. One method of
treatment entails the addition of double-stranded
oligodeoxynucleotide "linker" molecules whose base
sequence includes one or more restriction site sequences,
Scheller, R.H. et al., Science 196, 177 (1977). A
second method, termed "tailing", involves addition of
oligo-G and oligo-C sequences at the ends of the
endonuclease-treated plasmid and viral DNAs, respectively,
in a reaction catalyzed by terminal transferase. (It
will be understood that base sequences in DNA refer
to deoxyribonucleotides, while base sequences in RNA
refer to ribonucleotides.) At the point of joining, a
GGCC sequence is generated, which is a restriction
site sequence specific for HaeIII. The inserted segment
may be released from the plasmid by digestion with
HaeIII [see Villa-Komaroff, L.
et al., Proc. Nat. Acad. Sci. USA 75, 3727 (1978)]. The
linker method enables the sequence at the joint between
the two DNAs to be precisely defined. The tailing
method produces a family of joined molecules. There is
a one-third probability that a given clone, joined by
tailing, will have the same translational reading frame
as the transfer vector gene to which it is joined,
which enables expression of the cloned gene by
read-through translation from the transfer vector gene.
There is also a one-half probability that the inserted
DNA will be joined in the same translation orientation,
so that the composite probability that a given clone
can be expressed is 1/6, see Polisky, B., et al.,
Proc. Nat. Acad. Sci. USA 73, 3900 (1976); and
Itakura, K. et al., Science 198, 1056 (1977).
Tailing is therefore preferred where expression is
desired in the absence of evidence that the vector
and the insert are in phase with respect to reading
frame.
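
The composite probability of 1/6 follows from enumerating the possible outcomes of a tailed junction. The short Python sketch below does this explicitly. It assumes, as the text does implicitly, that the three frame offsets and the two orientations are equally likely and independent; the function name is hypothetical.

```python
# Minimal sketch: enumerate reading-frame and orientation outcomes of a
# tailed junction, assuming all six combinations are equally likely.
from fractions import Fraction
from itertools import product


def expressible_fraction() -> Fraction:
    """Fraction of clones whose insert is in frame and in the sense orientation."""
    frame_offsets = range(3)                # insert frame relative to vector gene, mod 3
    orientations = ("sense", "antisense")   # direction of the insert in the vector
    outcomes = list(product(frame_offsets, orientations))
    expressible = [o for o in outcomes if o == (0, "sense")]
    return Fraction(len(expressible), len(outcomes))


if __name__ == "__main__":
    print(expressible_fraction())
```

Only one of the six combinations (offset 0, sense orientation) permits read-through translation of the insert, which is why several independent clones would normally need to be screened.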

Transfer of the recombinant transfer vector 
to the desired host is accomplished by means appro- 
priate to the individual host-vector pair. Plasmids 
are generally transferred to a microorganism host by 
transformation. The vector-containing host replicates 
the transfer vector in keeping with its own cell 
division with the result that proliferation of the 
host cells results in concomitant multiplication of the 
recombinant transfer vector. Host cells containing 
a particular recombinant insert can be identified by 
appropriate selection means. For example, insertion 
of an exogenous DNA fragment at the PstI site of 
plasmid pBR322 interrupts the gene conferring 
ampicillin resistance, so that host bacteria trans- 
formed by recombinant plasmids fail to be ampicillin 
resistant. Non-transformed cells can be screened by
an appropriate transfer vector marker gene that is 
not affected by the insertion. The descendants of a 
single host cell containing a recombinant transfer 
vector are properly termed a clone of that cell strain. 
The inserted DNA segment carried by the transfer 
vector is thereby cloned. All copies derived therefrom 
have identical base sequences except for extremely 
rare random mutational changes. Host cells containing 
a recombinant transfer vector serve as an essentially 
inexhaustible source of supply for the cloned DNA. 




Expression of the cloned DNA may be
manifested by transcription, synthesis of mRNA
corresponding to the cloned DNA, or by translation,
synthesis of protein coded by the mRNA transcribed
from the cloned DNA. The occurrence of transcription
expression may be detected by the appearance of RNA
capable of hybridizing specifically with the cloned
DNA. Translation expression may be detected by the
appearance of a function specific for the expected
protein. For example, such a function may be an
enzyme activity, a hormonal activity or an immunological
specificity that is characteristic of the
protein coded by the cloned gene. In the case of
viral gene products, the appearance of an immunologically
reactive protein, such as HBsAg or HBcAg
in the case of HBV, is the most likely possibility.
Other sorts of specific binding reactions may be
appropriate in certain circumstances. A sensitive
in situ solid-phase radioimmunoassay has been developed
for detecting expression from single colonies of
transformed bacteria, Villa-Komaroff, L. et al., supra.

The above-described biological system for
maintaining, replicating and synthesizing virus components
provides for the first time a means for conducting
clinical, biochemical and genetic research on
viruses which can only be detected, directly or indirectly,
in infected humans or higher primates. Such
viruses, termed NP-viruses herein, include, but are not
limited to, the Hepatitis B Virus, the "slow viruses"
such as kuru and the agent implicated in the etiology
of multiple sclerosis, and the xenotropic viruses, such
as the C-type particles implicated in the causation
of certain tumors. Little is presently known about such
viruses, because of the lack of a suitable biological
system for conducting experiments. Their public
health significance should not be underestimated, since
their mechanism of action bears directly upon the
mechanisms of cancer induction and on the development
of auto-immune diseases. The present invention
opens an entire new field for clinical, biochemical,
immunological and genetic research on virus-related
diseases. The system provides the following capabilities:
the viral genome can be maintained and replicated
in genetically pure form. Nucleotide sequence
data can be obtained which will provide full information
on the amino acid sequences of viral proteins, when
correlated with information obtained by direct amino
acid sequencing. Paradoxically, nucleotide sequences
are easier to determine than amino acid sequences.
Partial amino acid sequences, particularly at the ends
of proteins, are useful to help establish starting
points and reading frames. Labeled viral genetic
materials can be used in hybridization experiments to
locate and quantitate viral insertions in infected cell
genomes. The viral proteins can be expressed in host
cells, thereby permitting their characterization, production
of antibodies against them, development of
assays for their detection and measurement and the
preparation of adducts and derivatives thereof.
Vaccines can be prepared from the viral proteins. Such
vaccines can be made available in the needed quantities
and provide a substantial safety factor, since
vaccines can be made by the described methods free of
any contamination by intact or infectious virus
particles. Antibodies against viral proteins are
useful for clinical diagnosis of viral infection. The
ability to make viral proteins in quantity makes it
possible to study their biochemical characteristics and
modes of action in contributing to the viral
pathogenesis. The foregoing capabilities are
illustrative only of the immediate benefits of the
research made possible by the present invention.
Longer term findings relating to subtle or unpredicted
phenomena may also be expected to be of
great significance.
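
How a partial N-terminal amino acid sequence helps establish a starting point and reading frame can be sketched as follows. The code translates a DNA strand in each of its three forward frames using the standard genetic code and reports the frame in which a known peptide appears. The peptide is the 9-residue N-terminal sequence quoted in the background section (taking Ser at the ambiguous position); the DNA string is a hypothetical sequence constructed only so that it encodes that peptide after a two-base leader, and is not presented as the cloned HBV sequence. The function names are likewise hypothetical.

```python
# Minimal sketch: locate the reading frame of a DNA sequence from a known
# N-terminal peptide. The DNA sequence used below is hypothetical.
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, codons enumerated in TCAG order ('*' marks a stop).
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate a DNA string from its first base, ignoring a trailing partial codon."""
    dna = dna.upper()
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def find_frame(dna: str, peptide: str):
    """Return (frame, nucleotide_start) of the first frame encoding peptide, else None."""
    for frame in range(3):
        index = translate(dna[frame:]).find(peptide)
        if index >= 0:
            return frame, frame + 3 * index
    return None


if __name__ == "__main__":
    n_terminal = "MENITSGFL"  # Met-Glu-Asn-Ile-Thr-Ser-Gly-Phe-Leu
    hypothetical_dna = "GC" + "ATGGAGAACATCACATCAGGATTCCTA"
    print(find_frame(hypothetical_dna, n_terminal))
```

A match in exactly one frame fixes both the initiation codon and the phase in which the rest of the nucleotide sequence should be read.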

The following examples are illustrative 
of the invention, as applied to HBV. The invention 
is not limited to its embodiment described in the 
examples. The system is applicable to any virus
which cannot conveniently be maintained except by 
infection of humans or higher primates, but of which 
the genetic material, whether DNA or RNA, can be 
obtained, in whole or in part. 

EXAMPLE 1
Cloning a viral DNA genome

Double-stranded circular HBV-DNA was obtained
from Dane particles containing 25 µg DNA, as described
by Hruska, et al., supra. The DNA was initially
screened for sensitivity to restriction endonucleases
by gel electrophoresis of the products of enzymic
digestion. Gel electrophoresis fractionates nucleic
acids according to their molecular length, Helling, R.,
et al., J. Virol. 14, 1235 (1974). Treatment of 100 ng
DNA with EcoRI endonuclease (2 units) resulted in a
single sharp band corresponding to about 3200 base pairs
(bp) length. Similar treatment with BamHI endonuclease
resulted in two fragments corresponding to about 1200
and 2000 bp length. Restriction endonucleases were
obtained from New England BioLabs, Beverly, Massachusetts.
Units are defined by the manufacturer. All reactions
using restriction endonucleases were carried out in
buffers recommended by the manufacturer. From the
number of fragments obtained in each case, it was
inferred that HBV-DNA contains a single EcoRI site and
two BamHI sites.

The DNA transfer vector selected was the
plasmid pBR325 (Bolivar, F., Gene 4, 121 (1978)),
which is derived from plasmid pBR322 (Bolivar, F. et al.,
Gene 2, 95 (1977)) and is capable of transforming E. coli.
Plasmid pBR325 carries genes conferring chloramphenicol
resistance (Cm^r) and ampicillin resistance (Ap^r) on
transformed cells. An EcoRI site exists in the Cm^r
gene such that an insertion of exogenous DNA at the
EcoRI site renders the Cm^r gene inoperative while
leaving the Ap^r gene unaffected. Recombinant clones of
transformed E. coli are identified as chloramphenicol
sensitive and ampicillin resistant, while
non-transformed cells, sensitive to both chloramphenicol
and ampicillin, fail to grow in the presence of
either antibiotic. Clones transformed with
non-recombinant pBR325 are identified as chloramphenicol
resistant and ampicillin resistant. The microbiological
methods used for growth and selection of
recombinant strains were standard methods, described
in Experiments in Molecular Genetics by Jeffrey H.
Miller, Cold Spring Harbor Laboratory (1972).
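
The selection scheme just described amounts to a small decision table, sketched below in Python. The function name and the returned labels are hypothetical; the phenotype assignments follow the paragraph above.

```python
# Minimal sketch: classify a colony from its growth on chloramphenicol (Cm)
# and ampicillin (Ap) plates, following the pBR325 selection scheme above.

def classify_colony(grows_on_cm: bool, grows_on_ap: bool) -> str:
    """Map an antibiotic-growth phenotype to the inferred plasmid status."""
    if grows_on_ap and not grows_on_cm:
        # Cm resistance gene interrupted by an insert at the EcoRI site.
        return "recombinant pBR325"
    if grows_on_ap and grows_on_cm:
        # Both resistance genes intact: plasmid without insert.
        return "non-recombinant pBR325"
    if not grows_on_ap and not grows_on_cm:
        return "non-transformed"
    # Cm-resistant but Ap-sensitive is not expected in this scheme.
    return "unexpected phenotype"


if __name__ == "__main__":
    for cm in (False, True):
        for ap in (False, True):
            print(f"Cm growth={cm!s:5}  Ap growth={ap!s:5}  ->  {classify_colony(cm, ap)}")
```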

For the insertion process, purified pBR325,
50 ng, and 300 ng HBV-DNA were first treated together
with EcoRI endonuclease, 10 units (10 µl total vol.)
at 37°C for one hour to yield linear plasmid DNA.
The reaction mixture was heated to 65°C for five minutes
to inactivate EcoRI endonuclease.
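
The DNA masses quoted above can be converted to an approximate insert-to-vector molar ratio, as in the following Python sketch. The masses (300 ng HBV-DNA, 50 ng pBR325) and the HBV-DNA length of about 3200 bp come from this example. The pBR325 length of roughly 6000 bp and the average mass of 650 g/mol per base pair are assumptions introduced here for illustration and are not stated in the application; the function name is hypothetical.

```python
# Minimal sketch: insert-to-vector molar ratio from DNA masses and lengths.
# PBR325_BP and AVG_BP_MASS are assumptions, not values from the application.

AVG_BP_MASS = 650.0   # g/mol per base pair, a common approximation (assumed)
HBV_BP = 3200         # approximate HBV-DNA length, from Example 1
PBR325_BP = 6000      # approximate plasmid length (assumed)


def picomoles(mass_ng: float, length_bp: int) -> float:
    """Convert a mass of double-stranded DNA in ng to pmol of molecules."""
    # ng -> g is 1e-9 and mol -> pmol is 1e12, giving a net factor of 1e3.
    return mass_ng * 1e3 / (length_bp * AVG_BP_MASS)


if __name__ == "__main__":
    insert_pmol = picomoles(300, HBV_BP)
    vector_pmol = picomoles(50, PBR325_BP)
    print(f"insert: {insert_pmol:.3f} pmol, vector: {vector_pmol:.4f} pmol")
    print(f"insert:vector molar ratio is about {insert_pmol / vector_pmol:.2f}")
```

Under these assumptions the viral DNA is present in roughly an elevenfold molar excess over the plasmid; the ratio itself does not depend on the assumed mass per base pair, only on the two lengths.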

The DNA was isolated from the reaction
mixture by two cycles of ethanol precipitation. The
precipitate was resuspended in 10 µl H2O to which a
buffer concentrate was added to give 50 mM Tris-HCl
pH 8.0, 1 mM ATP, 10 mM MgCl2 and 20 mM dithiothreitol.
The mixture was pretreated by incubation at 37°C for
five minutes, followed by five minutes at room temperature.
The mixture was then cooled in an ice bath and incubated
with 1 unit T4 ligase (P-L Biochemicals, 11,000 units/ml)
at 14°C for 15 hours. The reaction mixture was added
directly to a suspension of E. coli cells prepared for
transformation by standard techniques. The host cell
strain chosen was E. coli HB101, described by Boyer, H.W.
& Roulland-Dussoix, D., J. Mol. Biol. 41, 459-472 (1969).
The choice of a particular strain was based upon
convenience. Strain HB101 contains no other plasmids, is
sensitive to chloramphenicol and to ampicillin, and it
is relatively easy to grow and maintain stocks of the
organism.
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Single colonies of transformed cells 
containing a recombinant plasmid, as judged by 
chloramphenicol sensitivity and ampicillin re- 
sistance, were grown in culture to provide a source 
of plasmid DNA. Cultures were grown in L-broth at 
37 °C with aeration and harvested in late log or 
stationary phase. Alternatively, transformed cells 
were grown in a suitable minimal medium, as described 
by Bolivar, F. , et al. , supra , and Bolivar, P., 
supra , to an optical density at 660 nra of 1.0, using 
a 1 cm cuvette. Chloramphenicol, 170 p.g/ml, was 
then added and the culture was incubated overnight. 
In either case, the plasmid DNA was isolated as 
supercoils from a cell lysate, using the method of 
ethidium bromide CsCl density gradient centrif ugation 
described by Clewell, D.B. and Helinsky, D.R., Proc. 
Nat. Acad. Sci. USA 62 , 1159 (1969) . . Plasmid DNA 
prepared from transformed cells was treated with 
EcoRI endonuclease and fractionated by gel electro- 
phoresis, as described. Single colonies were screened 
by the toothpick assay described by Barnes, W^M. , 
Science 195 , 393 (1977) , to identify those bearing 
plasmids with large inserts. Two independently isolated 
recombinant plasmids containing insertions about 1200 bp 
in length were selected for subsequent studies. These 
were designated pEco-3 and pEco-63. 

In similar fashion the BamHI fragments of
HBV-DNA were separately cloned, using the BamHI site
of plasmid pBR322 for insertion. Dane particle DNA
(200 ng), labeled with ³²P by the nick translation
method (Rigby, P.W.J., et al., J. Mol. Biol. 113,
237 (1977)), was mixed with 200 ng unlabeled Dane
particle DNA and 2 µl of 10-fold concentrated BamHI
digestion buffer. The DNA was digested with 5 units
BamHI endonuclease for 1 hour at 37°C. The mixture
was heat treated at 65°C for 5 minutes to inactivate
the enzyme and the DNA recovered by two cycles of
ethanol precipitation. The transfer vector, pBR322,
was similarly digested with BamHI endonuclease and
further treated with alkaline phosphatase as
described by Ullrich, A., et al., Science 196, 1313
(1977). BamHI digested Dane DNA (250 ng) was
incubated with 680 ng pBR322, treated as described,
for 15 hours at 14°C in a reaction mixture containing
50 mM tris-HCl, pH 8.0, 1 mM ATP, 10 mM MgCl₂, 20 mM
dithiothreitol and 1 unit of T4 DNA ligase, following
a pre-heating treatment as previously described. The
ligation reaction mixture was used to transform E. coli
and transformants were selected for ampicillin
resistance and tetracycline sensitivity. A recombinant
plasmid bearing the BamHI fragment of about 2100 bp was
designated pBam-132. A plasmid bearing a smaller
fragment of about 1100 bp was also obtained, designated
pBam-69. Since the EcoRI site lies within the
approximately 2100 bp BamHI fragment (see FIG. 1), it has
been possible to clone the 1100 bp BamHI fragment from
cloned EcoRI-treated HBV-DNA.

A preparation of HBV-DNA from pEco-63 was
obtained by specific cleavage to release the HBV-DNA, and
inserted at the PstI site of pBR325. In this procedure,
the plasmid pEco-63 (3 µg) was first digested with EcoRI
endonuclease, then treated with DNA ligase, under
conditions previously described for the respective
reactions. The resulting mixture of circular pBR325 and
HBV-DNA was then incubated with PstI endonuclease and
rejoined using DNA ligase. Both pBR325 and HBV-DNA
have a single PstI site, so that the entire HBV-DNA can
be inserted at the PstI site of pBR325. The resulting
recombinant plasmid was designated pPst-7.
EXAMPLE 2

Identification of virus DNA in a recombinant plasmid 

Recombinant plasmids pEco-3, pEco-63,
pBam-132 and pPst-7 were prepared by growing
transformed cells and isolating DNA therefrom, and
separating host cell DNA from recombinant plasmid DNA
by equilibrium density gradient centrifugation in the
presence of ethidium bromide. Recombinant plasmid
DNA was then treated with the restriction
endonuclease specific for the respective insertion site.
The DNA was fractionated by gel electrophoresis and
analyzed by the method of Southern, E.M., J. Mol. Biol.
98, 503 (1975). In the Southern method, the DNA is
first fractionated by agarose gel electrophoresis,
then denatured in situ and transferred directly from
the gels to nitrocellulose filters. The band pattern
of the gels is thus replicated on the nitrocellulose
filters. Denatured DNA binds to nitrocellulose filters.
The filter-bound DNA is identified by hybridization with
³²P-labeled DNA of known origin. In the case of HBV-DNA
clones, ³²P-labeled DNA from Dane particles was used as
the hybridization probe. The results are shown in FIG. 2.
Lanes 1, 2, 3 and 4 represent pEco-3, pEco-63, pBam-132
and pPst-7, respectively. FIG. 2A (bright lines on
dark field) shows the gel electrophoretic pattern of
the DNAs prior to hybridization. Two bands are seen in
each case, visualized by fluorescence staining with
ethidium bromide, the uppermost band being the linear
transfer vector DNA (pBR325 in lanes 1, 2 and 4, and
pBR322 in lane 3) and the lower band being the putative
HBV-DNA. (The smaller DNA fragments migrate downward,
as the figure is oriented.) Lane A is a standard
prepared from HindIII-treated bacteriophage DNA. FIG. 2B
is an autoradiogram of the nitrocellulose filter after
hybridization with ³²P-HBV-DNA. A band of hybridized
DNA is observed in each case, corresponding with the
putative HBV-cloned DNA, while very little ³²P-DNA is
observed hybridized to the plasmid DNA bands. The
³²P-DNA hybridized to the plasmid was known to be
slightly contaminated with pBR325, which probably
accounts for the slight degree of hybridization observed
with the plasmid bands. In this manner, all clones
have been tested for identity. The four plasmids
tested were thus shown to carry HBV-DNA.

Figure 2C shows the results of an independent
experiment using an independently prepared sample of
³²P-labeled Dane particle DNA as probe. Lane 1 shows
pEco-63 DNA digested with EcoRI endonuclease, visualized
by ethidium bromide fluorescence staining (bright bands
on dark field); Lane 2 shows hybridization of the DNA
of lane 1 to ³²P-labeled Dane particle DNA, visualized
by autoradiography (dark band on light field); Lane 3
shows molecular weight standards prepared by HindIII
digestion of λ DNA; Lane 4 shows pBam-132 DNA digested
with BamHI endonuclease; Lane 5 shows hybridization of
lane 4 DNA with ³²P-labeled Dane particle DNA; Lane 6
shows pPst-7 DNA digested with PstI endonuclease; and
Lane 7 shows hybridization of lane 6 DNA with
³²P-labeled Dane particle DNA.

EXAMPLE 3 
Transcription expression 

Transcription expression was demonstrated by
showing that mRNA isolated from host cells transformed
by a recombinant transfer vector was complementary with
viral DNA. The experimental method used herein was
that of Alwine, J.C. et al., Proc. Nat. Acad. Sci. USA
74, 5350 (1977). In the Alwine et al. method, RNA
fractionated by gel electrophoresis is transferred
directly to a solid phase support, preserving the gel
banding pattern. Hybridization to a ³²P-labeled DNA
probe is carried out on the solid phase support. The
method is analogous to the technique described in
Example 2 but differs in detail because RNA does not
bind to nitrocellulose filters. In the method of
Alwine et al., diazobenzyloxymethyl paper filters are
employed to bind RNA transferred from the
electrophoresis gel. After binding the RNA, the derivatized
paper is treated to hydrolyze excess diazo groups to
prevent non-specific binding of the ³²P-labeled probe.

The labeled DNA probe used in this example
was cloned pEco-3 or pEco-63 DNA labeled with ³²P
during growth of the host strain. To eliminate
hybridization between the pBR325 portion of the labeled
probe and its mRNA, a 50-fold excess of unlabeled pBR325
was added to the hybridization mixture.

RNA was isolated from host cells carrying
either pEco-3, pEco-63, pBam-69, pBam-132, pPst-7 or
pBR325 grown to mid-log phase in 100 ml batches. Cells
were collected by centrifugation for 10 minutes at
6000 rpm in a GSA rotor (DuPont Instruments, Newtown,
Connecticut). The pellet was resuspended in 2 ml of
10 mM tris, pH 7.6, 5 mM magnesium acetate and 10 mM
KCl, then transferred to a tube containing 1 mg lysozyme.
The cells were then quick-frozen, 0.25 ml sodium dodecyl
sulfate 10% (w/v) was added, and the mixture was thawed
and thoroughly mixed. Sodium acetate, 1 M, pH 5.2,
0.25 ml, was added with mixing.

The RNA was extracted with water-saturated
phenol, 2.5 ml, by intermittent mixing at 37°C for a
period of 30 minutes. The aqueous phase was removed
and re-extracted with fresh water-saturated phenol.
The aqueous phase was then extracted with ether. A
centrifugation at 5000 rpm for 5 minutes was helpful to
separate the phases. A gummy material at the interface
was discarded. RNA was precipitated by addition of a 2/3
volume of 5 M NaCl and 2.5 volumes of ethanol,
incubated overnight at -20°C. The precipitate was
collected by centrifugation at 10,000 rpm (HB4 rotor,
DuPont Instrument Co., Newtown, Connecticut) for
20 minutes at -20°C, washed once with ethanol and
then redissolved in 4 ml of 10 mM tris, pH 7.4, 1 mM
EDTA. The solution was centrifuged at 10,000 rpm in
the HB4 rotor for 10 minutes at 0°C, and the pellet
discarded. To the supernatant solution was added 8 ml
of 4.5 M sodium acetate, pH 6, to precipitate RNA
preferentially at -20°C for 8 hours. The precipitate
was collected by centrifugation at 10,000 rpm for 20
minutes in the HB4 rotor at -20°C. The foregoing
precipitation generally removed about 70% of the DNA. The
precipitate was resuspended in 3.5 ml tris, 10 mM, pH
7.4, 1 mM EDTA, 7 ml sodium acetate and again
precipitated. The final pellet was resuspended in 0.4 ml
tris-EDTA and stored frozen.

RNA, prepared as described, was fractionated
by gel electrophoresis for hybridization analysis as
described by Alwine et al., using 10 µg RNA per lane.
The results are shown in FIG. 3. Figure 3A shows the
gel electrophoresis results, as visualized by fluorescence
staining. In every case, two major RNA bands are seen
corresponding to 16S and 23S ribosomal RNA. FIG. 3B
shows an autoradiogram of ³²P-HBV-DNA from pEco-63,
10 cpm/µg, capable of hybridizing to RNA in the
respective gels. Lanes 1-6 represent the results with
RNA extracted from cells transformed with the following
plasmids: Lane 1, pBam-69; Lane 2, pBR325; Lane 3,
pPst-7; Lane 4, pEco-63; Lane 5, pEco-3; Lane 6, pBam-132.
Lanes A and B are standards of purified bacteriophage
MS-2 RNA and E. coli ribosomal RNA, respectively.

It can be seen that hybridizable material was
found in each case, and that the extent of hybridization
was significantly greater in the case of the recombinant
plasmids. Furthermore, in comparing the size of
hybridizable material, it can be seen that the larger
clones, pEco-63 and pEco-3, gave rise to a wider
range of RNA sizes and to longer maximal length RNAs
than did the shorter insertions, pBam-69 and pBam-132.

From the foregoing data it is clear that
transcription expression of cloned HBV-DNA occurs
in E. coli.

EXAMPLE 4

Nucleotide sequence of HBV-DNA

The sequence of the entire HBV genome was
obtained from cloned HBV-DNA carried on plasmids pEco-3,
pEco-63 or pPst-7 described in Example 1, by the method
of Maxam, A. and Gilbert, W., Proc. Nat. Acad. Sci. USA
74, 560 (1977). The sequence is given in Table 1.
The sequence is written as a linear sequence beginning
at the EcoRI cleavage site. The sequences of both
strands are shown, the upper sequence of each line
reading from 5' to 3', left to right, the lower
(complementary) sequence reading from 3' to 5', left to
right. The abbreviations used indicate the bases of
the deoxynucleotide sequence: A for Adenine, T for
Thymine, G for Guanine and C for Cytosine.
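The two-strand layout described for Table 1 can be illustrated with a brief sketch. The function names are illustrative assumptions, and the example input is simply the EcoRI recognition sequence GAATTC, used here as a stand-in; it is not quoted from Table 1.

```python
# Watson-Crick pairing for the four deoxynucleotide abbreviations used in the text.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complementary_strand(upper: str) -> str:
    """Return the lower strand, read 3' to 5', aligned base-for-base under `upper`."""
    return "".join(COMPLEMENT[base] for base in upper)


def format_duplex(upper: str) -> str:
    """Lay out both strands: upper 5'->3' and lower 3'->5', left to right."""
    lower = complementary_strand(upper)
    return f"5' {upper} 3'\n3' {lower} 5'"


print(format_duplex("GAATTC"))
```

Because the lower strand is written 3' to 5', it is the plain complement of the upper strand, not the reverse complement.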

EXAMPLE 5 

On the basis of the nucleotide sequence of
HBV-DNA, as determined in Example 4, the location of a
sequence coding for the S protein, an immunologically
active protein constituent of HBsAg, was identified using
the partial amino acid sequence data known from the
work of Peterson, D.L., et al. (1978), supra. The
smaller BamHI fragment of about 1,100 bp length was found
to contain a nucleotide sequence coding for a sequence
similar to the N-terminal 19 amino acids of the protein
constituent of HBsAg, and also coding for the same three
C-terminal amino acids described by Peterson, in
phase with the N-terminal sequence and just prior to a
TAA termination codon. The protein encoded by this
sequence is 226 amino acids long and has a molecular
weight of 25,398, in satisfactory agreement with
the mass (22,000-24,000) determined by sodium
dodecyl sulfate gel electrophoresis of other protein
constituents of HBsAg isolated by Gerin, J.L. and
Shi, J.W.K., or by Peterson et al. (1978) supra. The
226 amino acid protein described herein is designated
the S protein. For reference purposes, the reading
frame of the S protein coding sequence is designated
Frame 1. Frames 2 and 3 are shifted forward 1 and
2 nucleotides, respectively. The relationships are
illustrated by the following diagram, based on the
first 9 nucleotides of the S protein coding sequence:

              ATGGAGAAC
    Frame 1:  ATG GAG AAC
    Frame 2:   TGG AGA AC
    Frame 3:    GGA GAA C
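The frame convention can be made concrete with a small sketch that splits the nine nucleotides ATGGAGAAC into complete codons for each frame. The function name is an illustrative assumption; the frame numbering follows the definition in the text (Frame 1 is the S protein frame, Frames 2 and 3 are shifted forward by 1 and 2 nucleotides).

```python
def codons_in_frame(seq: str, frame: int) -> list[str]:
    """Return the complete codons of `seq` read in frame 1, 2 or 3."""
    if frame not in (1, 2, 3):
        raise ValueError("frame must be 1, 2 or 3")
    offset = frame - 1  # Frame 2 skips one nucleotide, Frame 3 skips two
    # Stop at len(seq) - 2 so that only full three-base codons are returned.
    return [seq[i:i + 3] for i in range(offset, len(seq) - 2, 3)]


first_nine = "ATGGAGAAC"  # first 9 nucleotides of the S protein coding sequence
for frame in (1, 2, 3):
    print(f"Frame {frame}:", " ".join(codons_in_frame(first_nine, frame)))
```

Trailing nucleotides that do not fill a codon are dropped, so Frames 2 and 3 each yield two codons from this nine-base window.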

The amino acid composition of the S protein,
predicted from the nucleotide sequence, is in very
close agreement with that reported for the protein
constituent of HBsAg, described by Peterson et al.
(1978) supra. However, the N-terminal amino acid
sequence differs from that previously reported, by
having a leucine residue at position 15, instead of a
serine. The map location of the S protein coding
region is shown in FIG. 4.

Because of the prevalence of intervening
sequences in eucaryotic genes (Robertson, M.S.,
et al., Nature 278, 370 (1979)), it is not possible
to presume the colinearity of a gene with the amino
acid sequence of the protein product. There is,
however, no evidence for an intervening sequence in
the S protein gene, since the molecule predicted by
the DNA sequence closely approximates the
characteristics of an immunologically active constituent of the
surface antigen. Any intervening sequence(s) would
have to be small (<150 bases); most intervening sequences
in structural genes are longer. The N-terminal and
C-terminal ends of the molecule are in phase, thus any
intervening sequence must also maintain the phase.
Therefore, the conclusion is justified that the
identified S protein coding region is colinear with
the mRNA.

The complete amino acid sequence of S protein,
based on the DNA nucleotide sequence, is given in
Table 2. Standard abbreviations used in protein
chemistry are used to denote the amino acids. The
starting point identified for the S-protein is the
methionine residue coded by nucleotides 1564-1566 in
Table 2. As indicated in FIG. 4 and in Table 2, the
S-protein coding region includes a substantial region
coding for an additional N-terminal sequence of amino
acids beginning at the methionine coded by nucleotides
1042-1044 or alternatively the methionines coded by
nucleotides 1075-1077 or nucleotides 1399-1401. Protein
encoded by these regions has not been recognized as a
component of HBV. However, such proteins may serve a
biological function as yet unknown in the infection
process. Additionally, the proteins initiated from the
described starting points are useful S-protein
derivatives having N-terminal amino acid sequences coded by
naturally occurring nucleotide sequences, which have
greater molecular weight and higher antigenicity than
S-protein itself. These S-peptide analogs are useful
in eliciting antibodies directed against S-protein, for
immunization and for assay purposes.
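The arithmetic behind the alternative starting points can be checked with a short sketch. It assumes that each coordinate quoted above is the first nucleotide of a methionine codon and that the region between each alternative start and nucleotide 1564 is read without interruption; the variable and function names are illustrative only.

```python
S_START = 1564        # first nucleotide of the S-protein Met codon (from the text)
S_LENGTH = 226        # residues in S-protein (from Example 5)
ALTERNATIVE_STARTS = [1042, 1075, 1399]  # upstream Met codons named in the text


def extra_n_terminal_residues(start: int, s_start: int = S_START) -> int:
    """Residues added ahead of S-protein when translation begins at `start`."""
    distance = s_start - start
    if distance % 3 != 0:
        raise ValueError(f"start {start} is not in phase with the S-protein frame")
    return distance // 3


for start in ALTERNATIVE_STARTS:
    extra = extra_n_terminal_residues(start)
    print(f"start {start}: {extra} extra residues, {extra + S_LENGTH} in total")
```

All three distances are multiples of three, which is consistent with the statement that these methionines lie in the same reading frame as the S-protein coding region.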
There are two TacI restriction sites located
near either end of the S-protein coding region. The
smaller BamHI fragment was treated with TacI
endonuclease to provide blunt ends. HindIII linkers
were attached by blunt end ligation to the blunt ends
of the TacI fragment [Sugino, A., et al., J. Biol.
Chem. 252, 3987 (1977)]. The fragment was then
inserted into the expression plasmid ptrpE30, derived
from plasmid ptrpED50 [Martial, J., et al., Science
205, (1979)]. Plasmid ptrpE30 contains the operator,
promoter, attenuator and ribosome binding sequence of
the tryptophan operon, together with a nucleotide
sequence coding for seven amino acids of the trp E
protein followed by a HindIII site in the direction
of normal translation. This plasmid was used for
convenience in providing a known reading frame
compatible with expression of S-protein, upon insertion at
the HindIII site.

The expression plasmid ptrpE30 was pretreated
with HindIII endonuclease. The treated S-protein coding
fragment was then inserted into the treated plasmid
by means of DNA ligase catalyzed joining reactions. The
HindIII site of ptrpE30 is known from sequence data
to provide a reading frame in phase with the inserted
S-protein coding sequence. Transformation of E. coli
HB101 led to expression of a trp E-S protein fusion
protein under tryptophan operon control, and inducible
with β-indolylacrylic acid, as next described. This
strain was designated E. coli HB101/ptrpE30-HBsAg.

Bacterial cells transformed by ptrpE30/HBsAg were
grown in a standard minimal medium (M9) supplemented
with leucine, proline, vitamin B1 and ampicillin, at
37°C. In early log phase, the trp operon was induced
by addition of β-indolylacrylic acid (30 µg/ml of
medium). Control cultures were left uninduced. After
3 more hours of growth, 1.5 ml of cells were
radioactively labeled by addition of 20 µCi ³⁵S-L-
methionine and incubation for 10 minutes. The cells
were then collected by centrifugation, washed and
resuspended in 250 µl of buffer containing glycerol 10%
(v/v), β-mercaptoethanol 5% (v/v), and SDS 2.3% (w/v)
in 0.0625 M tris pH 6.8. The suspension was boiled
for 5 minutes, then applied to a 10% (w/v)
SDS-polyacrylamide gel and fractionated by electrophoresis.
The protein bands were visualized by autoradiography.
The results are shown in FIG. 5.

Individual isolates of transformed HB101 ptrpE30/HBsAg
were designated p126, p135, p146, p150, p155
and p166, respectively. The proteins of induced and
non-induced cultures are shown side by side for
comparison, labeled, e.g., p126ind or p126, respectively.
Standards include cells transformed with ptrpE30
lacking an insert, and a mixture of proteins of known
size: bovine serum albumin, ovalbumin, carbonic
anhydrase and lysozyme, having molecular weights (M.W.)
of 69,000 ("69K"), 43,000 ("43K"), 30,000 ("30K") and
14,300 ("14.3K") respectively.

The expression of the trpE-S protein fusion
protein was demonstrated by the appearance of bands,
unique to induced cultures, indicated in FIG. 5 by the
small arrows, of a protein having a M.W. of approximately
27,000. The calculated M.W. of the trpE-S protein
fusion product is 27,458. The fusion protein includes
7 amino acids from the N-terminus of the trp E protein,
and 12 amino acids coded by the HindIII linker and the
nucleotides lying between the TacI site and the start
of the S-protein coding region. The amino acid
sequence of the fusion protein is: Met-Gln-Thr-Gln-
Lys-Pro-Thr-Pro-Ser-Leu-Ala-Arg-Thr-Gly-Asp-Pro-Val-Thr-
Asn-S, where S stands for the amino acid sequence of
the S-protein.
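The residue counts and molecular weights quoted for the fusion protein can be cross-checked with a brief sketch. The variable names are illustrative assumptions; the leader sequence, the count of 7 trp E residues, the figure of 226 residues and the two molecular weights (27,458 and 25,398) are taken from this Example.

```python
# N-terminal leader of the fusion protein, as listed ahead of "S" in the text.
LEADER = ("Met-Gln-Thr-Gln-Lys-Pro-Thr-Pro-Ser-Leu-Ala-Arg-"
          "Thr-Gly-Asp-Pro-Val-Thr-Asn")

TRP_E_RESIDUES = 7        # residues contributed by the trp E N-terminus
S_PROTEIN_RESIDUES = 226  # length of S-protein
FUSION_MW = 27_458        # calculated M.W. of the fusion product
S_PROTEIN_MW = 25_398     # calculated M.W. of S-protein

leader_residues = LEADER.split("-")
leader_length = len(leader_residues)
linker_and_spacer_residues = leader_length - TRP_E_RESIDUES
fusion_length = leader_length + S_PROTEIN_RESIDUES
leader_mass_contribution = FUSION_MW - S_PROTEIN_MW

print("leader residues:", leader_length)
print("residues from linker and TacI-to-start region:", linker_and_spacer_residues)
print("fusion protein length:", fusion_length)
print("mass attributable to the leader:", leader_mass_contribution)
```

The leader therefore accounts for 19 residues (7 plus 12) and for the difference of about 2,060 between the two calculated molecular weights.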
Expression of the S-protein coding region
was detected by its immunochemical reactivity with
antibody to HBsAg, in a competitive radioimmune
assay with labeled HBsAg (AUSRIA, trademark Abbott
Laboratories, North Chicago, Ill.). Expression is
also detected by immunoprecipitation. A culture of
E. coli HB101/ptrpE30-HBsAg is induced with
β-indolylacrylic acid, and 3 ml samples are pulse labeled
with 2 µCi of ¹⁴C-labeled amino acids or ³⁵S-methionine
for a constant time, at various intervals after induction.
Samples from the zero and 4 hour-induced cultures are
immunoprecipitated after reaction with antibody to
HBsAg, using formaldehyde treated Staphylococcus
aureus to collect the antigen-antibody complexes, as
described by Martial, J.A., et al., Proc. Nat. Acad.
Sci. USA 74, 1816 (1977). The precipitated proteins
are solubilized and fractionated by electrophoresis in
SDS-polyacrylamide gels. The results show that
immunoprecipitable protein appears in substantial amount
only after induction, confirming the expression of the
S-protein coding region under tryptophan operon
control, and confirming the immunological reactivity
of S-protein with antibodies to HBsAg.

The expression of S-protein by individual
bacterial colonies is detected by a modification of
the polyvinyl disk method of Broome, S. and Gilbert, W.,
Proc. Nat. Acad. Sci. USA 75, 3727 (1978). A disk of
polyvinyl that has been washed thoroughly is floated
on a solution of unlabeled IgG (in this case comprising
antibody to HBsAg) at a concentration of 10-60 µg/ml
in 0.2 M NaHCO₃, pH 9.2 for 3 minutes. The disk is
then washed 2 times in wash buffer (10 mg/ml gelatin,
1% serum (human, rabbit or guinea pig), 0.1% NP40,
0.02% NaN₃ in phosphate-buffered saline). The disk is
then applied to an agar plate containing either
lysed bacterial colonies or small liquid samples that
have absorbed into the agar. The lysis of bacterial
colonies can be achieved in any one of three ways:

1) exposure to CHCl₃ in a desiccator for
10-20 minutes,

2) transfer of bacterial colonies to an
agar plate containing lysozyme, EDTA and Tris-HCl
pH 9,

3) overlay of the agar plate containing
colonies with a lysozyme, EDTA, Tris-HCl, 10% wash
buffer and 1% agarose solution. After the overlay
solidifies, the coated polyvinyl disk can be applied
directly.
All three methods appear to possess similar
sensitivity. The overlay technique has the advantage
that bacteria can be recovered from positive
colonies after the lysis procedure. After a 1-4
hour incubation at 4°C the polyvinyl disk is again
washed 2 times in wash buffer. The polyvinyl disk
is now incubated with 2 ml of ¹²⁵I-IgG (anti-HBsAg)
in wash buffer (2 × 10⁶ cpm/ml) overnight at 4°C. The
polyvinyl disk is washed 2 times at 42°C in wash
buffer for 15 minutes apiece, then washed 2 times in
distilled water at room temperature. The disk is
then exposed to x-ray film at -70°C for 18-48 hours.
Areas that possess antigen appear as dark spots on the
developed x-ray film. Colonies that possess antigen
are identified as expressing the S-protein coding
region. Cultures are grown from selected colonies for
the purpose of producing the S-protein on a large scale.
The trp E-S protein fusion product is purified from
(cell) lysates by conventional means, including gel
filtration and affinity chromatography.
EXAMPLE 6
Bacterial Synthesis of S-Protein

The expression product of Example 5 is a
fusion protein comprising S-protein and a 19 amino
acid N-terminal sequence derived from the trp E
protein (first 7 amino acids from the N-terminus),
the HindIII linker (next 3 amino acids) and that
portion of the HBV genome between the TacI site and
the methionine initiating the S-protein (9 amino
acids). For many applications, including vaccination
of humans, it is preferred to achieve synthesis of
S-protein itself, or one of its naturally coded
derivatives, as shown in Table 2. It is technically
feasible to remove the nineteen amino acid N-terminal
sequence by limited treatment with an exopeptidase
(aminopeptidase); however, the yield of S-protein
would be expected to be low.

Expression of S-protein per se can be
accomplished by modifying both the expression plasmid
and the S-protein coding fragment, to remove from the
former the nucleotides coding for the host portion
of the fusion protein, and to remove from the latter
any nucleotides preceding the start codon of the
S-protein structural gene. Any expression plasmid
may be employed, preferably one having an insertion
site close to the beginning of translation, such as
ptrpE30 or pBH20 (Itakura, et al., Science 198, 1056
(1977)).

Treatment to remove short nucleotide segments
is accomplished using exonucleolytic enzymes. A
preferred enzyme is T4 polymerase, which, in the absence
of added deoxynucleoside triphosphates, catalyzes
3' to 5' exonucleolytic digestion of double-stranded
DNA (Englund, P.T., J. Biol. Chem. 246, 3269 (1971)).
The extent of digestion is controlled by selection
of proper temperature, reaction time and amount of
enzyme, according to principles well known in the
art. Experimentation will be necessary in each
instance, since optimal reaction conditions must be
determined for each lot of enzyme and for each DNA
to be modified. By these means, the extent of
digestion can be controlled. Termination of digestion
at a predetermined stopping point is achieved by
including a single deoxynucleoside triphosphate in
the reaction mixture, corresponding to the desired
stopping point. For example, in the presence of
dATP, the DNA is digested 3'-5' until the polymerase
reaches a dA residue, at which point further net
digestion ceases. Several cycles of digestion, each
with its predetermined stopping point, can be carried
out in sequence, to construct DNA molecules having a
predetermined end point. Exonucleolytic digestion
with T4 polymerase affects only the strands having 3'
termini. The complementary strands remain as unpaired
single stranded tails, which must also be removed. S1
nuclease is the preferred enzyme for the purpose. The
product of combined treatment with T4 polymerase and S1
nuclease is blunt-ended, double-stranded DNA.
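The stopping rule described for T4 polymerase digestion can be modeled in a simplified sketch. This is an idealized model of a single strand only, under the assumption that net removal halts exactly when the 3'-terminal base matches the one deoxynucleoside triphosphate supplied; the function name and the example sequence are hypothetical, and the subsequent S1 nuclease trimming of the complementary tail is not modeled.

```python
from typing import Optional


def digest_from_3prime(strand: str, supplied_base: Optional[str] = None,
                       max_removed: Optional[int] = None) -> str:
    """Idealized 3'->5' exonucleolytic digestion of one strand (written 5'->3').

    supplied_base: the single dNTP present (e.g. "A" for dATP); net digestion
                   stops once this base is at the 3' terminus.
    max_removed:   optional cap standing in for time/enzyme-limited digestion.
    """
    remaining = strand
    removed = 0
    while remaining:
        if supplied_base is not None and remaining[-1] == supplied_base:
            break  # removal and re-addition of this base balance out
        if max_removed is not None and removed >= max_removed:
            break
        remaining = remaining[:-1]  # remove one nucleotide from the 3' end
        removed += 1
    return remaining


example = "GGATCCTAGGC"  # hypothetical strand, not taken from the specification
after_cycle_1 = digest_from_3prime(example, supplied_base="A")
after_cycle_2 = digest_from_3prime(after_cycle_1, supplied_base="T")
print(after_cycle_1)
print(after_cycle_2)
```

Chaining calls with different supplied bases corresponds to the successive cycles of digestion mentioned in the text, each with its own stopping point.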

The above-described treatment can be used to
treat an existing expression plasmid to remove the
nucleotides coding for the host portion of the fusion
protein. The essential elements to be preserved are
termed the expression unit. The expression unit includes
a promoter and a ribosomal binding site capable of
acting in the host organism. As a practical matter, it
is not necessary to remove precisely the nucleotides
coding for the host portion of the fusion protein. The
relationship between the ribosomal binding site and
the start codon (AUG) is such that the start codon
may be located anywhere within 3 to 11 nucleotides
of the ribosomal binding site (Shine et al., Proc.
Nat. Acad. Sci. USA 71, 1342 (1974); Steitz, J.,
et al., Proc. Nat. Acad. Sci. USA 72, 4734 (1975)).
In this 3-11 nucleotide region, the first AUG to be
encountered sets the reading frame for translation.
In the case of ptrpE30, described in Example 5,
the removal of a minimum of 23-29 nucleotides from
the HindIII site provides a site for insertion into
an expression unit under tryptophan operon control.
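The spacing rule between the ribosomal binding site and the start codon can be expressed as a simple check. This sketch assumes that "within 3 to 11 nucleotides" means 3 to 11 nucleotides lie between the last base of the binding site and the A of AUG; that reading, the function name and the example mRNA are assumptions made for illustration and are not taken from the specification.

```python
from typing import Optional, Tuple


def first_start_codon(mrna: str, rbs_end: int,
                      min_gap: int = 3, max_gap: int = 11) -> Optional[Tuple[int, int]]:
    """Find the first AUG whose spacing from the ribosomal binding site is allowed.

    rbs_end: 0-based index of the last nucleotide of the binding site.
    Returns (index of the A in AUG, number of spacer nucleotides), or None.
    """
    for gap in range(min_gap, max_gap + 1):
        i = rbs_end + 1 + gap
        if mrna[i:i + 3] == "AUG":
            return i, gap  # the first AUG encountered sets the reading frame
    return None


# Hypothetical message: binding site + 6-nucleotide spacer + start of a coding region.
rbs = "AGGAGG"
message = rbs + "UUAACC" + "AUGGAGAAC"
print(first_start_codon(message, rbs_end=len(rbs) - 1))

# A spacer longer than 11 nucleotides places the AUG outside the window.
too_far = rbs + "C" * 14 + "AUGGAGAAC"
print(first_start_codon(too_far, rbs_end=len(rbs) - 1))
```

A result of None corresponds to the situation, discussed later in this Example, in which little or no protein is expected.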

The digestion of ptrpE30 by HindIII
endonuclease is carried out under conditions essentially
as described in Example 1 for cleavage of plasmid
DNA with restriction enzymes. The treated DNA is
recovered from the reaction mixture by two cycles of
ethanol precipitation. In one optimized T4 polymerase
digestion reaction, 15 µg of DNA is resuspended in
H₂O and a solution of concentrated salts is added to
provide a reaction mixture containing 70 mM Tris
pH 8.8, 70 mM MgCl₂, 10 mM dithiothreitol and 13.75
units of T4 polymerase (P-L Biochemicals, Milwaukee,
Wis.) in a total volume of 250 µl. The reaction
mixture is incubated 3.3 minutes at 37°C. The
reaction is terminated by rapidly transferring the
incubation mixture to an ice bath, then inactivating
the enzyme by 5-minute heat treatment at 65°C. The
DNA is recovered by ethanol precipitation. S1 nuclease
treatment is carried out as described by Ullrich, A.,
et al., supra.

In similar fashion, the TacI fragment of
HBV-DNA comprising the S-protein coding region,
described in Example 5, is treated with T4 polymerase to
remove approximately 30 deoxynucleotides from each 3'
end. BamHI linkers are added by blunt end ligation.
The linkers have the sequence 5'-CCGGATCCGG-3' on
one strand and its complementary sequence on the
other. Treatment with HpaII endonuclease, which
cleaves the sequence CCGG to yield CGG, yields a
DNA fragment which may be joined to any site having
a 5'-terminal CG, for example HpaII cut DNA or ClaI
cut DNA. A partial restriction map of the TacI
fragment is:
The TacI fragment, treated as described,
is readily inserted into ptrpE30, also treated as
described, and similarly provided with a
HpaII-specific linker, in a DNA ligase catalyzed reaction
as described by Valenzuela, et al., Nature 280,
815 (1979). Bacterial cells are transformed with
the insert-bearing plasmid. Transformants are
selected by resistance to ampicillin as described in
Example 5. Cultures grown from single-colony isolates
are induced with β-indolylacrylic acid, and
pulse-labeled with ³⁵S-methionine as described in Example 5.
The labeled proteins are visualized by gel
electrophoresis and autoradiography. The clones yielding
protein bands in the 27,000 M.W. region are highly
likely to be synthesizing S-protein, without a leader
sequence.

If removal of the host protein coding region
of the vector DNA is incomplete, there is a 1/6 chance
that the inserted DNA will be expressed as a fusion
protein. However, if too many nucleotides are removed
from the vector DNA, it is probable that no protein
coded by the insert DNA will be formed, while if the
treated insert is too long, such that more than 11
nucleotides separate the ribosomal binding site from
the start codon, little or no protein will be formed.
Only if the vector retains part of its coding sequence,
or the insert treatment has removed part of the
S-protein coding region, will there be any possibility
of incorrect protein synthesis. Therefore, identity of
the protein made by a given clone is obtained by end
group analysis, for example, by Edman degradation, to
confirm the N-terminal sequence Met-Glu-Asn-Ile of
S-protein. The correct plasmid construction is confirmed
by DNA base sequence analysis (Example 4). Proof
of structure of the expressed S-protein is accomplished
by complete amino acid sequence analysis. True
S-protein, synthesized by a bacterial strain, is
purified by standard methods, such as gel filtration
and affinity chromatography, and further characterized
by immunochemical tests and tryptic digest analysis.
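
The 1/6 chance of a fusion protein quoted above can be read as a simple counting argument. The sketch below assumes (this is an interpretation, not something the text states) that the insert ligates in either of two orientations and in any of three reading-frame phases with equal likelihood, so that only one of the six outcomes is read through in frame.

```python
from fractions import Fraction
from itertools import product

# Assumption (not from the text): both orientations and all three
# reading-frame offsets are equally likely after ligation.
ORIENTATIONS = ("forward", "reverse")
FRAME_OFFSETS = (0, 1, 2)  # insert phase relative to the vector's frame


def fusion_probability():
    """Fraction of (orientation, frame) outcomes read through in frame."""
    outcomes = list(product(ORIENTATIONS, FRAME_OFFSETS))
    in_frame = [o for o in outcomes if o == ("forward", 0)]
    return Fraction(len(in_frame), len(outcomes))


print(fusion_probability())  # 1/6
```

Under different assumptions (for example, directional cloning that fixes the orientation) the same count would give 1/3 instead.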

Purified S-protein is immunogenic and
cross-reactive with antibody to HBsAg. The amino
acid sequence, determined by the base sequence of the
S-protein coding region, is as follows:

MetGluAsnIleThrSerGlyPheLeuGlyProLeuLeuValLeuGlnAla
GlyPhePheLeuLeuThrArgIleLeuThrIleProGlnSerLeuAsp
SerTrpTrpThrSerLeuAsnPheLeuGlyGlySerProValCysLeu
GlyGlnAsnSerGlnSerProThrSerAsnHisSerProThrSerCys
ProProIleCysProGlyTyrArgTrpMetCysLeuArgArgPheIle
IlePheLeuPheIleLeuLeuLeuCysLeuIlePheLeuLeuValLeu
LeuAspTyrGlnGlyMetLeuProValCysProLeuIleProGlySer
ThrThrThrSerThrGlyProCysLysThrCysThrThrProAlaGln
GlyAsnSerMetPheProSerCysCysCysThrLysProThrAspGly
AsnCysThrCysIleProIleProSerSerTrpAlaPheAlaLysTyr
LeuTrpGluTrpAlaSerValArgPheSerTrpLeuSerLeuLeuVal
ProPheValGlnTrpPheValGlyLeuSerProThrValTrpLeuSer
AlaIleTrpMetMetTrpTyrTrpGlyProSerLeuTyrSerIleVal
SerProPheIleProLeuLeuProIlePhePheCysLeuTrpValTyrIle
Adaptation of the described techniques in
combination with methods known in the art makes it
feasible to construct a family of S-protein
derivatives of the general formula

       O
       ||
X-NH-S-C-Y

wherein S is the amino acid sequence of the S-protein,
X is an amino acid, peptide, protein or amino protecting
group, including but not limited to the
naturally coded amino acid sequences shown in Table 2,
and also including peptides composed primarily of
aromatic amino acids such as tyrosine, phenylalanine
and tryptophan, said peptides being less than about 4
amino acid residues in length, as described by Sela, M.,
Science, 166, 1365 (1969) and Sela, M., Cold Spring
Harbor Symposium on Quantitative Biology, Vol. 32
(1967), having the property of increasing the antigenicity
of proteins to which they are attached, and Y
is an amino acid, peptide, protein or carboxyl protecting
group in ester or amide linkage, including
but not limited to the peptides composed of aromatic
amino acids already mentioned. The S-protein has a
molecular weight of 25,398. The derivatives will
therefore have molecular weights greater than 25,398.
The described S-protein derivatives have enhanced
antigenicity and stability to proteolytic digestion.
The derivatives are therefore useful as antigens for
vaccination and for assay purposes.

Various amino protecting groups known in the
art are suitable for use in making derivatives of the
S-protein and peptide derivatives thereof. The choice
of a suitable amino protecting group depends upon such
factors as the nature of the amino acid to be protected,
relative ease of removal, and convenient reaction conditions
such as solvent, temperature, etc. Suitable amino
protecting groups include the benzyloxycarbonyl
(carbobenzoxy) group, substituted carbobenzoxy or
other urethane protecting groups, the trifluoroacetyl
group, the phthalyl (or phthaloyl) group, the
diphenylmethyl (benzhydryl) group, the triphenylmethyl
(trityl) group, the formyl group, lactams,
Schiff bases and N-amines, the benzylsulfonyl group,
the tritylsulfenyl group and the arylsulfenyl
group. Commonly used amino protecting groups include
the tert-butyloxycarbonyl group, the o-nitrophenylsulfenyl
group and the tosyl group. Reference is
made to standard works on peptide chemistry such as
Bodanszky, O., et al., Peptide Synthesis, Ch. 4,
Interscience Publ. (1966); Schroeder, The Peptides,
Vol. 1, pp. xxiii-xxix, Academic Press (1965); and
Protective Groups in Organic Chemistry (J.P.W. McOmie,
ed.) Plenum Press (1973).

Suitable carboxyl protecting groups known in
the art include lower alkyl esters, phenyl-substituted
lower alkyl esters, e.g., benzyl and benzhydryl
esters, p-nitrobenzyl esters, p-methoxybenzyl esters,
phthalimidomethyl esters, t-butyl esters, cyclopentyl
esters, methylthioethyl esters, trimethylsilyl
groups, and hydrazides. The choice of particular groups
depends upon such variables as previously noted for
choice of amino protecting groups. Commonly used
carboxyl protecting groups are methyl, ethyl, propyl,
t-butyl and benzyl.

Other functional groups, such as -OH and
guanidino groups, may be protected by known methods,
if desired.
Synthesis of the described S-protein derivatives
is accomplished as described by Sela,
et al., supra, or by modifications of the recombinant
DNA techniques described in Examples 1-6,
making use of appropriate restriction sites for
cleavage of the DNA near the desired starting point,
and selectively removing short end segments using
T4 polymerase. In cases where restriction endonuclease
cleavage yields a shorter product than desired,
the desired deoxynucleotide sequence can be
provided by chemical synthesis. (See, e.g., Goeddel,
D., et al., Nature, 281, 554 (1979).) The scope of
possible S-protein derivatives is not limited to
those peptides of the naturally coded sequence that
are initiated with a methionine residue, but includes
all possible subsequences of the naturally coded
sequence shown in Table 2.

In addition, glycosylated derivatives of 
the S-protein are antigenic and are useful for pro- 
duction of antibodies. The expected glycosylation 
sites are asparagine residues in the subsequences 
-Asn-M-(Ser) or (Thr)-, where M is any amino acid. 
There are three such sites, at amino acid positions 
3, 59 and 146 of the S-protein. In addition, there 
are two such sites within the naturally coded sequence 
providing useful S-peptide derivatives, thereby 
providing for glycosylated derivatives as well. 
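
The rule for expected glycosylation sites, Asn-M-(Ser or Thr) with M any amino acid, is easy to check mechanically. The following Python sketch scans the S-protein sequence for that pattern. The one-letter string is a transcription of the three-letter listing given for the S-protein and should be treated as an assumption, and the function name is hypothetical.

```python
import re

# S-protein in one-letter code, transcribed from the three-letter listing
# in this document (the transcription is an assumption of this sketch).
S_PROTEIN = (
    "MENITSGFLGPLLVLQA"
    "GFFLLTRILTIPQSLD"
    "SWWTSLNFLGGSPVCL"
    "GQNSQSPTSNHSPTSC"
    "PPICPGYRWMCLRRFI"
    "IFLFILLLCLIFLLVL"
    "LDYQGMLPVCPLIPGS"
    "TTTSTGPCKTCTTPAQ"
    "GNSMFPSCCCTKPTDG"
    "NCTCIPIPSSWAFAKY"
    "LWEWASVRFSWLSLLV"
    "PFVQWFVGLSPTVWLS"
    "AIWMMWYWGPSLYSIV"
    "SPFIPLLPIFFCLWVYI"
)


def glycosylation_sites(seq):
    """Return 1-based positions of Asn in Asn-M-(Ser or Thr), M = any residue."""
    # The lookahead leaves the two following residues unconsumed, so
    # closely spaced candidate sites are not skipped.
    return [m.start() + 1 for m in re.finditer(r"N(?=.[ST])", seq)]


print(len(S_PROTEIN))                  # 226
print(glycosylation_sites(S_PROTEIN))  # [3, 59, 146]
```

The three positions found agree with those stated in the text. The asparagines at positions 52 and 131 are followed directly by Ser and therefore do not fit the Asn-M-Ser/Thr spacing.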

EXAMPLE 7 
In Vitro Synthesis of S-Protein 

The expression of the S-protein coding
region is carried out in vitro using the DNA-directed
protein synthesis system described by Zubay, G.,
supra. The DNA used in the synthesis is either the
recombinant plasmid ptrpE30/HBsAg or the modified
recombinant plasmid described in Example 6 for
expression of S-protein. In addition, restriction
endonuclease cut fragments of HBV-DNA, such as the
TacI fragment including the S-protein coding region,
may be employed in the Zubay system. One or more
of the amino acids provided in the system is radioactively
labeled, in order to permit a sensitive
assay for the product protein. Synthesis of
S-proteins is detected by the binding of radioactively
labeled material to anti-HBsAg antibody or
anti-S-protein antibody, in any of the assay systems
previously described.

EXAMPLE 8

The HBV-DNA and restriction fragments thereof
are cloned in a bacteriophage transfer vector. For
this purpose, the phage λCh16A is suitable, Blattner,
F.R., et al., Science, 196, 161 (1977). The phage
contains a single EcoRI site, located in a lac5
substitution. Insertion into the lac5 region provides a
useful selection technique: when the chromogenic
substrate 5-chloro-4-bromo-3-indolyl-β-D-galactoside
(XG) is included in the plating medium, λCh16A
gives vivid blue plaques while λCh16A bearing an
insert in the EcoRI site gives colorless plaques when
plated on a Lac- bacterial host. Furthermore, the
EcoRI site provides an insertion locus near a
functional operator-promoter region, suitable for
expression of coding regions as fusion proteins bearing
an N-terminal portion of β-galactosidase.

EXAMPLE 9 

Identification of core antigen coding region 

The HBV-DNA nucleotide sequence read in phase
2 provides an open region of 555 bp length bounded by
a termination codon (TAG) and an initiation codon
(ATG). An open region is one containing no termination
codons in phase. The 555 bp region is the
largest such open region in phase two of the HBV
genome. An initiation sequence, TATACAAG, was
observed prior to the ATG start codon, which begins
at position 93, consistent with the conclusion that
the region is a coding region for a protein. (See
E.B. Ziff, et al., Cell, 15, 1463 (1978), and F.
Gannon, et al., Nature, 278, 428 (1979).) The
molecular weight of the encoded protein is 21,335,
consistent with the estimated M.W. of 21,000 derived
from gel electrophoresis. (See also Gerin, J.L.
and Shi, J.W.K., supra.)
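
The notion of an open region, a stretch with no termination codon in a given phase, can be illustrated with a short scan. This is a minimal Python sketch rather than the procedure used by the inventors: it treats the sequence as linear, the function name is hypothetical, and the demonstration uses an invented toy string, not HBV-DNA.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_open_region(seq, frame):
    """Return (start, length) of the longest stop-free stretch in one frame.

    frame is 0, 1 or 2; start is a 0-based index into seq and length is
    in nucleotides. The sequence is treated as linear in this sketch.
    """
    best_start, best_len = frame, 0
    run_start = frame
    for i in range(frame, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            # A stop codon closes the current open stretch.
            if i - run_start > best_len:
                best_start, best_len = run_start, i - run_start
            run_start = i + 3
    # Close the final stretch at the last complete codon.
    end = frame + 3 * ((len(seq) - frame) // 3)
    if end - run_start > best_len:
        best_start, best_len = run_start, end - run_start
    return best_start, best_len


# Toy sequence (invented): TAG | ATG GCT GCT AAA | TAA | GCT
toy = "TAG" + "ATGGCTGCTAAA" + "TAA" + "GCT"
print(longest_open_region(toy, 0))  # (3, 12)
```

Running the same scan for each of the three frames, and picking the longest result, corresponds to the statement that a given region is the largest open region in its phase.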

Significantly, the amino acid sequence of the
encoded protein includes an extensive region of
predominantly basic amino acids in the C-terminal
region of the protein. The encoded protein will
therefore bind tightly to DNA, in a manner similar to
a protamine, and consistent with the behavior expected
for the core protein of a virus.

The encoded protein has been further
identified as HBcAg by the existence of a single
internal methionine residue. Cleavage of the encoded
protein at this methionine residue would yield two
fragments having about 35% and 65%, by weight, of
the intact protein. Cleavage of isolated HBcAg by
CNBr yields fragments of approximately 40% and 60%, by
weight, of the intact protein, within experimental
error of the predicted sizes (J. L. Gerin and J.W.K.
Shi, personal communication).

On the basis of the predicted M.W., amino
acid sequence consistent with known functional properties,
and presence of a correctly placed internal
methionine residue, the coding sequence for HBcAg has
been identified. The predicted amino acid sequence of
HBcAg is given in Table 2 and the map location on
the HBV genome is shown in FIG. 4. The map in
FIG. 4 shows a possible alternative start codon at
position 2, which could provide an earlier initiation
point and a somewhat longer amino acid sequence. The
likelihood that the earlier start codon is actually
utilized in vivo is reduced by the fact that the ATG
codon at position 93 is preceded by an 18S ribosome
binding site sequence, whereas no such sequence precedes
the alternative start codon at position 2.

The expression of HBcAg in E. coli is obtained
by conventional insertion of a restriction fragment
containing the core antigen coding region into an
expressed bacterial operon located in a transfer
vector, in correct reading frame and orientation.
Selection of the plasmid of choice is based upon
considerations of operating convenience and yield. For
example, insertions in the tryptophan operon are capable
of providing high yields of expression product, as
shown in Example 5. Insertions in the β-lactamase
operon of pBR322 provide a protein that may be extracted
from the periplasmic region of the cell, for greater
ease of purification, and may prevent death of the host
cell should the expression product be toxic. Given the
known reading frame for the HBcAg gene, an expression
plasmid having an insertion site in the correct reading
frame is selected. Alternatively, the end to be
inserted proximally to the operon is tailored by
selective removal or addition of 1-2 nucleotides, using
known techniques, to provide correct phasing of the
reading frames of the operon and the insert.
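
The remark that phasing is corrected by removing or adding 1-2 nucleotides follows from counting modulo 3. The Python sketch below shows that arithmetic under simple assumptions: the two input counts and the function name are hypothetical, and the sketch ignores any constraints the actual enzymatic trimming or linker addition would impose.

```python
def phasing_adjustments(vector_coding_nt, insert_leader_nt):
    """Nucleotides to add or to remove so the insert's first codon is in phase.

    vector_coding_nt: nucleotides from the operon's start codon up to the
        junction (hypothetical input).
    insert_leader_nt: nucleotides in the insert ahead of its first codon
        (hypothetical input).
    Returns (add, remove); either adding `add` or removing `remove`
    nucleotides at the junction restores the reading frame.
    """
    offset = (vector_coding_nt + insert_leader_nt) % 3
    add = (3 - offset) % 3
    remove = offset
    return add, remove


print(phasing_adjustments(30, 4))  # (2, 1)
print(phasing_adjustments(30, 0))  # (0, 0)
```

In every case at most two nucleotides need to be added or removed, which is why a 1-2 nucleotide adjustment is sufficient.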
EXAMPLE 10

Identification of additional proteins coded
by HBV-DNA was facilitated by analysis of the nucleotide
sequence. The distribution of termination codons
in reading frame number 3 indicates an open region
capable of coding for a large protein of molecular
weight up to 95,000, hereinafter protein "A". The
probable initiation site was identified as an ATG
codon beginning at position 494. This start codon is
preceded by two possible initiation sequences, a
TATAAAG sequence beginning at position 104, and a TATAT
sequence beginning at position 400. The amino acid
sequence of protein A, and its position in the HBV-DNA
nucleotide sequence, are shown in Table 2 and in FIG. 4.

Gel electrophoresis of a Dane particle pre- 
paration in sodium dodecyl sulfate revealed a prominent 
band of protein having a M.W. of about 80,000, consis- 
tent with the hypothesis that the protein band is com- 
posed of protein A. It is possible that protein A is 
the DNA polymerase associated with Dane particles. 

A small protein, "protein B", was identified
in reading frame 2, as shown in Table 2 and FIG. 4.
It is noted that the number of nucleotides in the HBV
genome is not evenly divisible by 3. By continuous
tracking of the genome, triplet by triplet, one
eventually encounters all possible triplets in all
possible reading frames, in three circuits of the genome.
In the case of protein B, there exists a possible overlap
region in which the sequence coding for the
C-terminal end of protein B also codes for that part
of the "possible N-terminal" core gene region shown
in FIG. 4, in a different reading frame.
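
The three-circuit observation can be checked with a short simulation. The Python sketch below walks a circular genome triplet by triplet and reports how many circuits pass before the walk returns to its starting point. The length 3221 is the total given with Table 1; the function name is hypothetical.

```python
def circuits_to_return(genome_length):
    """Walk a circular genome triplet by triplet, starting at position 0.

    Returns (circuits, distinct_starts): the number of full circuits made
    before the walk returns to position 0, and how many distinct positions
    served as the first base of a triplet along the way.
    """
    position, triplets = 0, 0
    starts = set()
    while True:
        starts.add(position)
        position = (position + 3) % genome_length
        triplets += 1
        if position == 0:
            break
    circuits = 3 * triplets // genome_length
    return circuits, len(starts)


print(circuits_to_return(3221))  # (3, 3221)
print(circuits_to_return(3222))  # (1, 1074)
```

Because 3221 leaves a remainder of 2 on division by 3, each circuit shifts the phase, and every position becomes a triplet start within three circuits. A length divisible by 3, such as the hypothetical 3222, would stay in a single frame.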
The major identified coding regions of
HBV-DNA were found to be transcribed in the same
reading direction, hence from the same strand. The
complementary strand sequence was found to have
numerous termination codons in all reading frames.
Two possible coding regions for small proteins of
90 and 60 amino acids were located, the larger of
which is mapped in FIG. 4.

EXAMPLE 11

Antibody Formation in Experimental Animals

The trp E-S protein fusion protein described
in Example 5 and the S-protein described in Example 6
are sufficiently antigenic to elicit antibodies. The
antibodies are cross-reactive with HBsAg. Guinea pigs
are injected subcutaneously at 9, 14, and 56 day
intervals with 10 ml physiological saline or
phosphate-buffered saline containing 500 µg S-protein or trp
E-S protein fusion product, as described in Examples
5 and 6, respectively, purified as described. The
serum of the test animals is sampled at 0, 28, 56 and
84 days and assayed for antibody titer against Dane
particles or HBsAg partially purified from infectious
serum. The radioimmunoassay of Hollinger, F., et al.,
supra, is employed. The majority of animals exhibit
antibodies cross-reactive with HBsAg 84 days after
administration of the protein. Similar results are
obtained upon injection of monkeys. Accordingly, the
immunologically active protein constituents of HBV,
expressed by a microorganism that has been transformed
by a DNA transfer vector encoding said protein, are
capable of eliciting antibodies cross-reactive with an
immunologically reactive component of the virus.
The described proteins have the advantage
of being available in significantly larger quantities
than HBsAg obtained from Dane particles or carrier
serum. Furthermore, there is no danger of accidental
infection since there is no intact virus in the trp
E-S protein expression product, nor in the S-protein.
By contrast, viral proteins purified from serum
always pose the danger of viral contamination.

EXAMPLE 12 

As shown in Example 11, protein coded by the
genome of an NP virus and synthesized by a microorganism
is capable of eliciting antibodies cross-reactive
with an immunologically reactive component of
said NP virus. Furthermore, derivatives and fusion
protein products of such microorganism synthesized
proteins are antigenic and capable of eliciting antibodies
cross-reactive with an immunologically reactive
component of the NP virus. It therefore follows that
such proteins and protein derivatives, when purified as
described and administered in a physiologically
acceptable medium, constitute a vaccine for protection
against infection by the virus.

Sixteen chimpanzees are divided into three
groups. Group A (6 animals) is inoculated intravenously
with 1.0 ml of B.O.B. Hepatitis B virus; Group B
(4 animals) is inoculated intravenously with 1.0 ml
containing 5 mg of trp E-S protein fusion protein,
synthesized and purified as described in Example 5,
in physiological saline; Group C (6 animals) is the
control group and receives no inoculation. All
chimpanzees in Group A have evidence of clinical
hepatitis B (either antigenemia, enzyme elevations and/or
antibody response) within forty weeks. None of the
animals in Groups B or C show evidence of clinical
hepatitis B infection over the same 40-week period.
The chimpanzees of Group B are rendered immune to
subsequent challenge when inoculated intravenously with 1.0
ml of B.O.B. hepatitis B virus.
The S protein or a derivative thereof, as
described in Example 6, may be employed in a
similar fashion to provide the desired immunological
response.

While the invention has been described in 
connection with specific embodiments thereof, it will 
be understood that it is capable of further modifications 
and this application is intended to cover any variations, 
uses, or adaptations of the invention following, in 
general, the principles of the invention and including 
such departures from the present disclosure as come 
within known or customary practice within the art to 
which the invention pertains and as may be applied to the 
essential features hereinbefore set forth, and as 
follows in the scope of the appended claims. 
WE CLAIM:

1. A DNA transfer vector comprising at
least a portion of the genome of an NP-virus.

2. A DNA transfer vector as in claim 1
wherein said transfer vector is adapted to transform
a host cell wherein the transfer vector is replicated
and at least a part of the cloned genome
portion is expressed by transcription.

3. The transfer vector of claim 2 wherein
at least a part of the cloned genome portion is
expressed by translation.

4. The transfer vector of claim 1 wherein 
the NP-virus is Hepatitis B Virus. 

5. The transfer vector of claim 2 wherein 
the vector is a plasmid and the host is a bacterium. 

6. The transfer vector of claim 4 wherein 
the transfer vector is a plasmid and the host is 
Escherichia coli . 

7. The transfer vector of claim 6 wherein 
the vector is pEco-63 and the host is E. coli HB-101. 

8. A method for maintaining, replicating,
and expressing at least a portion of the genome of an
NP-virus comprising,

isolating the genetic material comprising at
least a portion of the genome of the NP-virus, or a
cDNA transcript thereof,

recombining said genetic material with a
DNA transfer vector, forming a recombinant transfer
vector,

transforming a host cell with the recombinant
transfer vector,

selecting a host cell strain capable of
maintaining, replicating, and expressing the recombinant
transfer vector, and

growing the selected host cell under
conditions favoring its proliferation, thereby maintaining,
replicating, and expressing at least a
portion of the genome of the NP-virus carried therein.

9. The method of claim 8 wherein the NP- 
virus is Hepatitis B Virus. 

10. The method of claim 9 wherein the transfer
vector is a plasmid and the host is Escherichia
coli.

11. A vaccine against an NP-virus comprising
a sterile, physiologically acceptable diluent and an
antigen comprising an immunologically active protein
constituent of the virus, expressed by a microorganism
that has been transformed by a DNA transfer vector
comprising a nucleotide sequence encoding said protein,
said transformed microorganism being capable of
expressing said nucleotide sequence.

12. A vaccine according to claim 11 wherein
the NP-virus is Hepatitis B virus.

13. A vaccine according to claim 11 wherein
the protein comprises an immunologically active protein
constituent of the surface antigen of Hepatitis
B Virus.

14. A vaccine according to claim 13 wherein
the protein comprises the S-protein.

15. A vaccine according to claim 14 wherein
the S-protein comprises additionally the N-terminal
amino acid sequence: Met-Gln-Thr-Gln-Lys-Pro-Thr-Pro-
Ser-Leu-Ala-Arg-Thr-Gly-Asp-Pro-Val-Thr-Asn-.

16. A method of making a vaccine against an
NP-virus comprising the steps of

a. transforming a microorganism with a DNA
transfer vector comprising a nucleotide sequence
encoding a protein of the virus, said nucleotide sequence
being inserted in a region of the transfer vector
controlled by an expressible operon, in reading frame
phase and orientation such that translation
expression of said operon results in translation
expression of said nucleotide sequence,

b. growing said microorganism under growth
conditions that allow expression of said operon, thereby
making a protein comprising the amino acid sequence
of said protein of the virus,

c. purifying the protein made in step b, and

d. mixing the purified protein with a
sterile, physiologically acceptable diluent, thereby
making a vaccine against an NP virus.

17. The method of claim 16 wherein the NP
virus is Hepatitis B virus.

18. The method of claim 16 wherein the
protein of the virus comprises an immunologically active
protein constituent of the surface antigen of Hepatitis
B virus.

19. The method of claim 18 wherein the
protein comprises the S-protein.

20. The method of claim 19 wherein the
protein comprising the S-protein comprises additionally
the N-terminal amino acid sequence: Met-Gln-Thr-Gln-Lys-
Pro-Thr-Pro-Ser-Leu-Ala-Arg-Thr-Gly-Asp-Pro-Val-Thr-Asn-.

21. An antigenic protein coded by the genome
of an NP virus, synthesized by a microorganism and
capable of eliciting antibodies cross-reactive with an
immunologically reactive component of said NP virus.

22. The protein of claim 21, coded by the
genome of Hepatitis B virus.

23. The protein of claim 22 comprising the
amino acid sequence of the S-protein of Hepatitis B
virus.
24. A microorganism containing and replicating
a DNA transfer vector comprising at least a
portion of the HBV genome.

25. A microorganism according to claim 24
comprising the bacterial strain Escherichia coli
HB101 pEco63.

26. A protein comprising the amino acid
sequence:

MetGluAsnIleThrSerGlyPheLeuGlyProLeuLeuValLeuGlnAla
GlyPhePheLeuLeuThrArgIleLeuThrIleProGlnSerLeuAsp
SerTrpTrpThrSerLeuAsnPheLeuGlyGlySerProValCysLeu
GlyGlnAsnSerGlnSerProThrSerAsnHisSerProThrSerCys
ProProIleCysProGlyTyrArgTrpMetCysLeuArgArgPheIle
IlePheLeuPheIleLeuLeuLeuCysLeuIlePheLeuLeuValLeu
LeuAspTyrGlnGlyMetLeuProValCysProLeuIleProGlySer
ThrThrThrSerThrGlyProCysLysThrCysThrThrProAlaGln
GlyAsnSerMetPheProSerCysCysCysThrLysProThrAspGly
AsnCysThrCysIleProIleProSerSerTrpAlaPheAlaLysTyr
LeuTrpGluTrpAlaSerValArgPheSerTrpLeuSerLeuLeuVal
ProPheValGlnTrpPheValGlyLeuSerProThrValTrpLeuSer
AlaIleTrpMetMetTrpTyrTrpGlyProSerLeuTyrSerIleVal
SerProPheIleProLeuLeuProIlePhePheCysLeuTrpValTyrIle
27. A protein having the formula

  H   O
  |   ||
X-N-S-C-Y

wherein S is the amino acid sequence of the S protein,

X is H or an amino protecting group or an
amino acid, or a peptide comprising at least the
C-terminal portion of the amino acid sequence:

MetGlyGlyTrpSerSerLysProArgLysGlyMetGlyThrAsnLeuSer
ValProAsnProLeuGlyPhePheProAspHisGlnLeuAspProAla
PheGlyAlaAsnSerAsnAsnProAspTrpAspPheAsnProValLys
AspAspTrpProAlaAlaAsnGlnValGlyValGlyAlaPheGlyPro
ArgLeuThrProProHisGlyGlyIleLeuGlyTrpSerProGlnAla
GlnGlyIleLeuThrThrValSerThrIleProProProAlaSerThr
AsnArgGlnSerGlyArgGlnProThrProIleSerProProLeuArg
AspSerHisProGlnAlaMetGlnTrpAsnSerThrAlaPheHisGln
ThrLeuGlnAspProArgValArgGlyLeuTyrLeuProAlaGlyGly
SerSerSerGlyThrValAsnProAlaProAsnIleAlaSerHisIle
SerSerIleSerAlaArgThrGlyAspProValThr

or a peptide of less than about 4 amino acids
length comprising, in random sequence, amino acids
selected from the group tyrosine, phenylalanine
and tryptophan, and

Y is OH, or a carboxyl protecting group,
or an amino acid or a peptide of less than about 4
amino acids length comprising, in random sequence,
amino acids selected from the group tyrosine,
phenylalanine and tryptophan.

28. A composition containing a protein
according to claim 27 in a physiologically acceptable
medium.

29. A protein according to claim 27 having
a molecular weight greater than 25,398.

30. A protein according to claim 27,
wherein at least one asparagine residue in the amino
acid subsequence -Asn-M-(Ser) or (Thr)- is glycosylated,
wherein M is any amino acid.
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TABLE 1 

Nucleotide Sequence of HBV-DNA, Beginning at the 
Eco RI site (Nucleotide 1406 of Table 2) 

Total base pairs =3221 



o p 


a u 


t- < 


H < 


P U 


u p • 




. H < 


< H 


H < 


< 


P U 


< H 


< H 


a a 




H 5 


p a 


a a 




p p 


<r V« 


CD P 


< t- , 


LD P 


H < 


a p 


< t- 


a ld 


P P 


< t- 


<?: h 


U LD 


f- < 


u cd 


H < 




H < 


U LD 


a cd 


CD U 


h < 


< H- 


a ld 




H < 




u a 


H < . 


P CD 


U CD 






« H 




H <t 


a p 


a ld 


< H 


< H 


P P 


U P 


H < 


LD U 


ld a 


P U 


a p 


< H 


< h- 


a u 


P U 


«. 


U LD 


< H * 


LD p 


P U 


t- < 




a p 


p a 


< P- 


<: h 


< H 


< h 


o a 


p a 


H < 


a U 


ld a 


<C 6- 


h <: 










a u 


<C H 


a u 




u a 


< 


P G 


P P 




H < 


p a 


U P 


H < 




H < 


<: h 


□ a 






*— < 


U CD 




U LD 


U CD 


< 


u a 


a cd 


< f- 




< t- 


CJ CD 


p u 


< I- 


H < 


< [-* 




t- < 


CD U 




cj a 


<: t- 


U CJ 


CJ CD 


£ < 


< 


CD P 


a cj 


< H . 


H < 


p p 


a u 


a ld 


" LD U 


CD u 





<L I— 
CJ P 
LD U 

<: 
u P 

a o 

< f- 
<r k 

o ca 

U LD 

< H 
U CD 

u u 
I- <t. 

H < 

U LD 
U O 
CD U 

H <t 
U CD 
<C H 
U CD 
U CD 



H 
CD 



<C 
H 

U 



< 

^ <c 
ta u 

< H 

U CD 

< 

CD U 

e» u 

CD CD 

f- < 
i5 U 
<t H 

u a 

U CD 
K<C 

O O 

u u 
^ <: 
u u 

CD U 
H <t 
U U 
. CD CJ 
H < 



I- < 



B 

CD 

a 



CD 

u 
u 



o a 
t- < 
u u 
< f- 

CD CJ 

a o 

CD CJ 
U CD 
a U 

U C3 
U CD 

* a cd 
J" ? ■ 
<: h 

U CD 

H < 

CD U 



tD a 
cd a 

CD U 
CJ S 
CD U. 
CD U 
< H ■ 

cj a 

H < 

K < 
CD CJ 
f- < 
CD U 

a cd 

U CD 
C3 G 
H<. 
U LD 

CJ CD 

u 3 

< H 

cd a 

CD U 

< H 
H < 

U L3 



a o 

H < 

a cd 

< k 

LD U 
O U 

f- <: 
cd a • 

CD CJ 

h* < 

CD U 

a cd 
• a cj 

< H 
QU 

< H 

H < 
UP 

H <C 
LD U 

< H 
LD CJ 
<• H 

. U P 
CD P 
U P 
P P 

< H 



< H 

< H . 
U P 
P P 
H < 

a p 
a p 
t- < 
p u 

H <C 

a p 

UP 

p p 
p p 

< h- 

< H 
U P 
P P 

< H 

P P ; 
H < 
P CJ 
<C f- 
P CD 

h <: 
5 H 

p p 

p p 

H < 
P CD 

P P 
U CD 
P P 
P P 

H < 

P P. 

< H 
P P 
P U 

' P P 
• J" < 

S H 

< H 

< H 

< H 
U P 

. p p 
cd a 
p p 

t < 
p p 

6- <C 



0020251 



-59- 



00015Y 



CJ o 

<J O 

u a 

a a 

< 

u a 

h < 

u a 

f- <t 

a a 

u a 

H < 

H < 

< H 
H < 

< H 

a a 

h <: 
< 

t- < 

H < 

t- <C 

H < 

C3 O 

U CJ 

a u 



o 
a 

CD 

a 
a 
o 



a 
cj 
u 
< 

CJ 
< 

u 
< 
u 



a u 
u a 
< 
u cj 

CJ U 
U CJ 
H <C 
<C H 
'*• < 

H <C 
O U 
CJ U 
I— <t 

u u 

U CJ* 
I- < 
(J U 

h <c 

H < 



a cj 

p u 

Kl < 

J- < 

b a 

a a 

u cj * 

O CJ 

a u 



< 

■8 



< 

< 

< 

U 

u 



t- < 



a 



s- 

•p o 

h < 

cj u 
a a 
h <c 

«- <t 
t < 

U CJ 
H < 
S- < 

a u. 

H.<C 
Ch 

u a 
«- <t 
u 3 

u f J 
u a 

< H 
H < 

a cj 
cj u 



u 

CJ 
U 

u 

O 



8 

< 

C3 
< 

(j 



CJ CJ 



u CD 

< k 

U CJ 
CD U 
H < 
U U 
CJ CD 
<C H 
<C 5- 
<C '~ 

a a 

CJ t? 



< 

a 
u 

•2 

CJ 
CD 



E- 

CJ 

a 
a 



cj p 

a a 
< h 
u CJ 

u p 
< 

a cj 



u 
< 

CJ 
CJ 



8 

u 



a u 

H < 
H < 

< £ 

U CJ 
H < 



< 
I- 



CJ 

< 
a 



p u 

u a 
cd a 

u CJ. 

< 
U CJ 

a cj 

u CJ 

H <c 
CJ u 
K<C 
UCJ 

pu 

pu 

< f- 

u CJ 

$- ^ 
ucj 
u a 

a cj 
^< 

CJ u 

b< 
ucj 

"CJ CJ 

88 

< H 
UU 
{-« 



-60- 



0020251 

0O015Y 



CJ o 

o cd 

U CD 

h < 
< 

a a 
cd u 
cd u 

(3 u 
5 

u u 
u a 
!- « 

cd a 
a u 

H < 

o a 

< H 

ucd 
t- < 

H < 
P U 
J- < 

fc < 

H < 

< H 

u cd 
u a 
. cd u 

H < 

U U 

u u 
o u 

H < 
*• < 

u u 
i- <r 

cd a 
u cd 
u cd 



< H 
U U 

a o 

< t- 
u a 

< v« 

CD CJ 
- H <C 

U CD 
H < 
CD U 
«t J- 

< H 
U CD 
U CD 

a o 
a u 
cd u 

cd a • 
a a 

H < 

< H 
H < 
CD U 

a cj 
a u 

< 

LD U 
US 

a u 
u cj 

<H 

U CD 
ID U 
< H 
U CD 

*- s 
ucd 

a cj 
a o 
i- < 
h 5 

cd u 

u a 

u u 



CJ CD 
< H 

<C H 
I- < 

*- <: 

t- < 
< 

U CD 
C H • 

H < 
< H 
H <£ 
CD U 

a a 
u u 
I- <t 
cj.cd 
h < 

U CD 

h* < 
CD CJ 

I- < 
h < 

UCD 

!- < 

fc < 

IS 

So 

L3 (J 
H < 

cd a 
t- < 

U (3 
CD U 
CJ CD 

a o 

< H 
H <t 

< H 

h <r 

a o 
u u 
a cd 

H < 
CD U 

p U 
o u 

U CD 



U O 

< • 

< H 

a u 

CD U 
H < 
H < 

- a p 

< H 
H < 

O LD 
CD U . 
CD U 
CD C3 
H < 

< H * 
U CD 
f- < 
H < 

a cd 

< -. 

< H 

[Nucleotide sequence listing, original pages 61-65: not legible in the scanned text.]






TABLE 2 

Base sequence and translation of HBV-DNA. Starting
point designated "0" in FIG. 4.

@ = methionine start signals

• = termination codons

A = A protein

B = B protein

C = core antigen

D = D protein 

S = S protein 
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